Results 1 to 10 of 814 (0.0s).
Error when parsing of Excel files - Tika - [mail # user]
...> as for files, in my case they are from customer and I don't want to share them. https://corpora.tika.apache.org/datasette/corpora-metadata?sql=select+file_path%2C+orig_stack_trace%0D%0Afr...
   Author: Tim Allison, 2020-07-29, 16:35
[TIKA-3147] Strip punctuation in lang id component within tika-eval - Tika - [issue]
...I noticed that "the quick brown fox jumped over the lazy dog" was identified as English in tika-eval.  However, if I added semi-colons, it was identified as Chinese. This is in alignment...
http://issues.apache.org/jira/browse/TIKA-3147    Author: Tim Allison, 2020-07-27, 20:58
[TIKA-3145] Add a content digester to tika-eval text stats - Tika - [issue]
...When comparing files, it can be useful to digest the text contents so that users can identify files that may have duplicate content but different overall digests.  Let's add a content d...
http://issues.apache.org/jira/browse/TIKA-3145    Author: Tim Allison, 2020-07-24, 22:43
[TIKA-3146] Add Nutch's TextProfileSignature digest to tika-eval's text stats - Tika - [issue]
...https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/crawl/TextProfileSignature Will require trivial modifications to work within the tika-eval context.  As with TIKA-31...
http://issues.apache.org/jira/browse/TIKA-3146    Author: Tim Allison, 2020-07-24, 22:43
Tika extract images - Tika - [mail # dev]
...Which endpoint are you using? On Wed, Jul 22, 2020 at 1:36 PM Tilman Hausherr wrote: > What happens when you try to do the same with tika-app from the command line? > Tilman...
   Author: Tim Allison, 2020-07-22, 23:56
[TIKA-3143] Enable custom resources and writers in tika-server - Tika - [issue]
...We've put in a fair amount of work into the configuration and robustness of tika-server.  I think it would be useful to enable users and even other modules within Tika to add custom han...
http://issues.apache.org/jira/browse/TIKA-3143    Author: Tim Allison, 2020-07-20, 17:09
[TIKA-3140] Add a metadata filter for tika-eval - Tika - [issue]
...If we go forward with TIKA-3137, it would be useful to add a metadata filter for tika-eval's text stats, including the junk detector....
http://issues.apache.org/jira/browse/TIKA-3140    Author: Tim Allison, 2020-07-17, 20:24
[TIKA-3142] Update Jenkins for main branch, maybe turn on more modern jdks - Tika - [issue]
...I think I did this for Tika-trunk.  I modified our jdk7 to jdk11 on tika-master.  Once Tika-trunk completes, I'll rename it to tika-master-jdk8 unless there are objections. What els...
http://issues.apache.org/jira/browse/TIKA-3142    Author: Tim Allison, 2020-07-17, 19:35
[TIKA-3137] Enable a metadata filter for the RecursiveParserWrapper - Tika - [issue]
...The RecursiveParserWrapper is designed to extract all metadata from every embedded file.  Some users may need more targeted ways of filtering the metadata to save on resources, e.g. mem...
http://issues.apache.org/jira/browse/TIKA-3137    Author: Tim Allison, 2020-07-17, 19:26
[TIKA-3073] Add gzip in- and out- interceptors to tika-server - Tika - [issue]
...On TIKA-3069, Carina Antunes requested compressing /rmeta output. This makes sense as a start...we might also look into allowing more configurability around which metadata fields and file ty...
http://issues.apache.org/jira/browse/TIKA-3073    Author: Tim Allison, 2020-07-16, 20:23