[TIKA-1019] Document links in Word documents don't leave a placeholder - Tika - [issue]    Author: Michael McCandless , 2012-11-12, 11:29
[TIKA-1024] An MP3 with an UTF-16 ID3 tag containing only the BOM should produce empty string value for that tag - Tika - [issue]
...This seems to be a difference between JVMs: on IBM's JVM I incorrectly see the BOM as the value of the tag, while on Oracle's JVM I correctly get the empty string. I'm not sure if this is a b...    Author: Michael McCandless , 2012-11-18, 15:53
[TIKA-1025] Powerpoint (.ppt) parser doesn't leave placeholder where documents are embedded - Tika - [issue]    Author: Michael McCandless , 2012-11-18, 16:11
[TIKA-1031] TikaCLI doesn't create sub-dirs when extracting Zip files - Tika - [issue]    Author: Michael McCandless , 2012-12-01, 17:53
[TIKA-1032] Powerpoint (.pptx) can have duplicate embedded ids - Tika - [issue]
...Apparently the relId is only unique within one slide ... I fixed it to prefix slideN_....    Author: Michael McCandless , 2012-12-01, 17:57
[TIKA-1033] Tika doesn't parse embedded OLE Chart/Graph objects - Tika - [issue]
...I have an example ppt that embeds a chart, but Tika mis-identifies it as an XLS document. The progID (oleShape.getProgID() in HSLFExtractor.handleSlideEmbeddedResources) is MSGraph.Chart.8 ... ...    Author: Michael McCandless , 2016-04-07, 13:51
[TIKA-1035] PDF bookmark text is not extracted - Tika - [issue]    Author: Michael McCandless , 2012-12-01, 18:05
[TIKA-1036] ZIP parsing doesn't leave placeholders for each package entry - Tika - [issue]    Author: Michael McCandless , 2012-12-01, 18:07
[TIKA-948] Embedded PDF extracted incorrectly as MS Works file from Word 97-2003 doc - Tika - [issue]
...This is just like TIKA-704, except that issue was for an OOXML Word doc but this is for the older Word 97-2003 format....    Author: Michael McCandless , 2012-08-09, 17:42
[TIKA-956] Embedded docs in Word doc are not inlined (text is always added to the end) - Tika - [issue]
...You can see this with the recently added testWORD_embedded_pdf.doc (for TIKA-948): the "Bye Bye" text comes before the "Werwjelrwoierj..." text from the embedded PDF, opposite of what you see...    Author: Michael McCandless , 2012-08-07, 21:42