Search criteria: author:"Tim Allison". Results 1 to 10 of 109 (0.0s).
[TIKA-2705] Allow configuration of TesseractOCRParser as we do for other parsers - Tika - [issue]
...It would be handy to be able to configure tesseract via our regular tika-config setup....
http://issues.apache.org/jira/browse/TIKA-2705    Author: Tim Allison , 2019-10-21, 13:30
[TIKA-2851] Upgrade to POI 4.1.1 when available - Tika - [issue]
...There were some regressions in POI's recent 4.1.0 release in EMF/WMF handling.  Unless there are other higher priority reasons to upgrade to 4.1.0, I propose we wait for 4.1.1. My apolo...
http://issues.apache.org/jira/browse/TIKA-2851    Author: Tim Allison , 2019-10-21, 21:19
[TIKA-2779] Integrate/parameterize new rotated text handling in PDFBox - Tika - [issue]
...PDFBOX-4371 ... thank you Tilman Hausherr!...
http://issues.apache.org/jira/browse/TIKA-2779    Author: Tim Allison , 2019-10-08, 17:35
[TIKA-2967] Handle digital signature data uniformly across at least PDF and ooxml - Tika - [issue]
...There are some inconsistencies in how we handle digital signature data between PDF and ooxml.   My sense is that this info belongs in the metadata, not in the xhtml (even if clearly mar...
http://issues.apache.org/jira/browse/TIKA-2967    Author: Tim Allison , 2019-10-17, 12:36
[TIKA-2965] Add a metadata flag for XFA and XMP in PDFs - Tika - [issue]
...It would be useful to be able to determine which PDFs in a given collection contain XFA and/or XMP.  Let's add a metadata flag for those embedded files....
http://issues.apache.org/jira/browse/TIKA-2965    Author: Tim Allison , 2019-10-17, 19:36
[TIKA-3049] Improve file detection...varia - Tika - [issue]
...I recently crawled a few bugzilla issue trackers to add files to our regression corpus.  I noticed that bugzilla is able to identify the mime types of a few file types that we're not, a...
http://issues.apache.org/jira/browse/TIKA-3049    Author: Tim Allison , 2020-02-20, 21:32
[TIKA-3026] Consider extracting structure/tags where possible in PDFs with the PDFMarkedContentExtractor - Tika - [issue]
...Some PDFs contain tags that may be useful in understanding the structure of the elements within a PDF, e.g. table markup, paragraph breaks, headers, etc.    The quality of the tags depends e...
http://issues.apache.org/jira/browse/TIKA-3026    Author: Tim Allison , 2020-02-24, 19:02
[TIKA-3050] Add xmp extraction to psd files - Tika - [issue]
http://issues.apache.org/jira/browse/TIKA-3050    Author: Tim Allison , 2020-02-25, 08:22
[TIKA-3047] Upgrade to POI 4.1.2 - Tika - [issue]
...Now available at a maven repo near you!  Thank you Andreas Beeker for running the release!...
http://issues.apache.org/jira/browse/TIKA-3047    Author: Tim Allison , 2020-02-25, 08:22
[TIKA-3033] Upgrade to PDFBox 2.0.19 when available - Tika - [issue]
http://issues.apache.org/jira/browse/TIKA-3033    Author: Tim Allison , 2020-02-25, 08:22