regression tests for 1.23-rc1 - Tika - [mail # dev]
...All, New reports are here: ran these with the most recent 1.23-SNAPSHOT on the full 500k sample. There are a few things to ...
   Author: Tim Allison, 2019-11-26, 17:30
Parsing files on a remote server - Tika - [mail # user]
...Thank you, David! I heartily second this recommendation: please do not reinvent the wheel! On Tue, Nov 26, 2019 at 6:13 AM David Pilato wrote: > You could have a look at FSCrawle...
   Author: Tim Allison, 2019-11-26, 14:44
[TIKA-3000] Users should be able to configure POI's IOUtils.setByteArrayMaxOverride - Tika - [issue]
   Author: Tim Allison, 2019-11-26, 11:33
[TIKA-2999] PDFParser should set, not add digital signature value - Tika - [issue]
   Author: Tim Allison, 2019-11-26, 11:33
Token Coordinates at Image - Tika - [mail # user]
...Hi Furkan, First, are you processing PDFs or actual image files? If PDFs, be careful about blacking out images because there may be some record of the underlying text in the file, ...
   Author: Tim Allison, 2019-11-25, 15:02
[LUCENE-5317] Concordance/Key Word In Context (KWIC) capability - Lucene - [issue]
...This patch enables a Lucene-powered concordance search capability. Concordances are extremely useful for linguists, lawyers and other analysts performing analytic search vs. traditional snipp...
   Author: Tim Allison, 2019-11-24, 18:15
[TIKA-2998] Allow users to extract font names in PDFs - Tika - [issue]
   Author: Tim Allison, 2019-11-23, 16:53
[TIKA-2966] Create a tika-eval SAXHandler - Tika - [issue]
...One of the improvements coming in 1.23 is the decoupling of the text stats calculator from the tika-eval app. To make this even easier to use, let's add a handler that will calculate t...
   Author: Tim Allison, 2019-11-22, 20:13
[TIKA-2997] Add embedded depth as a metadata field populated by RecursiveParserWrapperHandler - Tika - [issue]
   Author: Tim Allison, 2019-11-22, 18:49
[EXTERNAL] Docker image along with 1.23? - Tika - [mail # dev]
...K. Sounds like an example Docker file will meet your needs, Eric? Users can currently build their own images with the Docker file in tika-server, and there's logical-spark. As noted, ther...
   Author: Tim Allison, 2019-11-21, 13:02