[TIKA-2559] Expose language metadata from PDF documents - Tika - [issue]
...Tika does not currently return the language from a PDF's metadata (for an example PDF I'm seeking permission to share with you - perhaps for all PDFs). It would be useful to me (and I imagine...    Author: Matt Sheppard, 2018-02-07, 18:02
[TIKA-1730] Excel to HTML filtering seems to produce some font setting gibberish in output - Tika - [issue]
...Noticed while upgrading from Tika 1.8 to 1.10 - An .xls file linked below, which used to filter pretty normally, now produces the following...<div class="outside">&amp;C&amp;"A...    Author: Matt Sheppard, 2016-01-04, 15:29
[TIKA-1590] A particular PDF seems to trigger an infinite loop when being converted to HTML - Tika - [issue]
...The PDF at,_292_KB.pdf (which I'll also attach) appears to trigger an infinite loop (or ...    Author: Matt Sheppard, 2015-04-02, 11:37
[TIKA-1174] Invalid characters in filtered PDF output - Tika - [issue]
...The PDF document at produces invalid characters in the output when filtered by Tika 1.4. >/opt/funnelback/mbin...    Author: Matt Sheppard, 2015-03-15, 21:01
[TIKA-911] Converted PDF document contains question marks in place of spaces and inconsistent case - Tika - [issue]
...The PDF document at, when converted with Tika v1.1 using $ java -jar tika-app-1.1.jar Rust\ Biosecurity\ Brochure.pd...    Author: Matt Sheppard, 2015-03-02, 04:26
[TIKA-621] RTF parsing fails with Java 7 early access on 64bit platforms - Tika - [issue]
...I've run across an RTF document which Tika is failing to convert on 64bit platforms (Windows and Linux) using the Java 7 early access version. The same document is successfully converted on...    Author: Matt Sheppard, 2011-10-20, 12:34