Search criteria: *:*. Results 1 to 10 of 3199 (0.0s).
[TIKA-3094] Apache Tika fails to extract text for pptx extension. - Tika - [issue]
...This is a regression from Apache Tika 1.23. Text extraction for the .pptx extension, which worked with Apache Tika 1.23, no longer works in version 1.24. For .ppt extenti...
http://issues.apache.org/jira/browse/TIKA-3094    Author: Abhishek Chauhan , 2020-09-30, 17:45
[TIKA-3206] commons-io : 2.6, which is a transitive dependency of tika is vulnerable to "sonatype-2018-0705". - Tika - [issue]
...Tika has embedded commons-io.2.6.jar which is vulnerable to "sonatype-2018-0705". ISSUE: sonatype-2018-0705. SEVERITY: Sonatype CVSS 3: 7.8; CVE CVSS 2.0: 0.0. EXPLANATION: The commons-io package is vuln...
http://issues.apache.org/jira/browse/TIKA-3206    Author: Ankush Rana , 2020-09-30, 17:41
[TIKA-3044] add -C/--content cli option using WriteOutContentHandler - Tika - [issue]
...For text extraction, the cli currently provides both --text and --text-main options. For html files, --text will return the body, while --text-main will only return the title. There is curre...
http://issues.apache.org/jira/browse/TIKA-3044    Author: Alexander Klimetschek , 2020-09-30, 17:38
[TIKA-3205] Mime magic for more certificate related formats - Tika - [issue]
...As spotted by a Tika user on stackoverflow <https://stackoverflow.com/q/64119284/685641>, we only have mime magic for a handful of the certificate/key related formats, and are missing ...
http://issues.apache.org/jira/browse/TIKA-3205    Author: Nick Burch , 2020-09-30, 17:31
[TIKA-3196] PackageParser should attempt to parse entries from zip files with STORED entries with data descriptor - Tika - [issue]
...We are currently using Tika for text extraction. Some sites are returning zips that have entries with stored data descriptors which fail to extract due to the ZipArchiveInputStream...
http://issues.apache.org/jira/browse/TIKA-3196    Author: Trevor Bentley , 2020-09-29, 17:45
[TIKA-2518] tika app outputs warnings by default - Tika - [issue]
...Upon downloading the latest Tika and trying basic commands, it spews unwanted warnings, which makes parsing output necessary. Example 1: java -jar tika-app-1.16.jar --list-detectors Dec 05, 2017...
http://issues.apache.org/jira/browse/TIKA-2518    Author: Ryan Brueske , 2020-09-28, 04:32
[TIKA-3204] License incompliance with xmp-core 6.1.10 - Tika - [issue]
...Apache Tika 1.24.1 (and probably also older versions) has a dependency on xmp-core 6.1.10. Usage of this dependency does not comply with its license, because distribution of xmp-core is strict...
http://issues.apache.org/jira/browse/TIKA-3204    Author: Christian Seipel , 2020-09-24, 18:26
[TIKA-3202] Tika duplicates the ocr text - Tika - [issue]
...I'm using Tika 1.24.1 together with Tesseract from the Docker image apache/tika:1.24-full. The header X-Tika-PDFocrStrategy: OCR_AND_TEXT causes the issue: the output from PDF processing is duplicat...
http://issues.apache.org/jira/browse/TIKA-3202    Author: marek kapowicki , 2020-09-23, 05:54
[TIKA-3203] MP4Parser temporary files are not deleted from Tomcat temp folder - Tika - [issue]
...In our application, Tika is used as part of a Tomcat webapp. Tomcat sets its temp folder ($CATALINA_HOME/temp) as "java.io.tmpdir". The MP4Parser creates files in java.io.tmpdir....
http://issues.apache.org/jira/browse/TIKA-3203    Author: Isabelle Giguere , 2020-09-23, 01:24
[TIKA-3200] wrong language ("tr" instead of "ru") is assigned for recognized charset "windows-1251" - Tika - [issue]
...CharsetDetector is detecting windows_1251 using the detector org.apache.tika.parser.txt.CharsetRecog_windows_1251. This detector is creating a CharsetMatch with language "tr", but it should be "ru". ...
http://issues.apache.org/jira/browse/TIKA-3200    Author: Alexey Lukashov , 2020-09-22, 03:36