Solr 1.4 has Tika built in, so it can handle Word, PDF, and similar files. I keep forgetting the basic submission line, so here it is:

{{{
$ cd apache-solr-1.4.0/example
$ curl "http://localhost:8983/solr/update/extract?literal.id=doc5&commit=true" \
    --data-binary @/path/to/file.pdf -H 'Content-type:application/pdf'
}}}

Where {{doc5}} is a unique id. {{commit=true}} commits immediately so you can search for the document right away. Without the {{Content-type}} header, curl sends the body as form data ({{application/x-www-form-urlencoded}}) rather than as a PDF, and the request fails with "missing content stream".

----
[Information Retrieval |CategoryArchived.Computing.InformationRetrieval]
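Since the content type is the part that is easy to forget, the submission line above could be wrapped in a small helper. This is only a sketch: the function names {{content_type_for}} and {{solr_extract}} are made up for illustration, the extension-to-type table covers just the two formats mentioned here, and it assumes the same local Solr URL as above and an id that needs no URL encoding.

```shell
#!/bin/sh
# Hypothetical helper: map a file extension to the Content-type header value.
# Only PDF and legacy Word are listed; extend the table as needed.
content_type_for() {
  case "$1" in
    *.pdf) echo "application/pdf" ;;
    *.doc) echo "application/msword" ;;
    *)     echo "unknown file type: $1" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper around the curl line above.
# Usage: solr_extract <unique-id> <path-to-file>
solr_extract() {
  id="$1"
  file="$2"
  # Refuse to send anything if the content type cannot be determined.
  ctype=$(content_type_for "$file") || return 1
  curl "http://localhost:8983/solr/update/extract?literal.id=${id}&commit=true" \
    --data-binary "@${file}" -H "Content-type:${ctype}"
}
```

With the functions sourced into the shell, {{solr_extract doc5 /path/to/file.pdf}} should send the same request as the line above.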