Sorry, I made a mistake. I need the text content of PDF, TXT, DOC and DOCX
files to index in Solr.


Thanks for your help.



From: msaunier [mailto:[EMAIL PROTECTED]]
Sent: Friday, 5 January 2018 18:05
Subject: OCR Tika to read PDF, txt and doc docx




How can I install and use OCR to extract the content_html from files with
ManifoldCF?

I need the HTML content.


Thanks for your help,