[TIKA-2750] Update regression corpus - Tika - [issue]
...I think we've had great success with the current data on our regression corpus.  I'd like to refresh some data from Common Crawl with three primary goals: 1) include more interesting do...
   Author: Tim Allison, 2018-10-05, 13:40
updating data on the regression corpus - Tika - [mail # dev]
...All,  I opened to track updating data on the regression corpus.  Please track/join the conversation there if you'd like to participate...
   Author: Tim Allison, 2018-10-05, 13:30
Welcome to the regression vm! - Tika - [mail # dev]
...Tobias,  I just gave you access to the vm and sent login stuff to you personally.  I have to update some groups and permissions, but I'll let you know when that is ready.  Let m...
   Author: Tim Allison, 2018-10-05, 13:18
[TIKA-2679] Bump 1.x branch to Java 1.8 - Tika - [issue]
...As we've been warning.  We can revert this if anyone objects....
   Author: Tim Allison, 2018-10-05, 10:54
[TIKA-2745] Upgrade to PDFBox 2.0.12 when available - Tika - [issue]
...Voting for rc1 started yesterday....
   Author: Tim Allison, 2018-10-04, 23:07
[TIKA-2478] RFC822 includes redundant copies of the text - Tika - [issue]
...MBOX messages often get parsed into four documents: a. The mbox file - outer container "/" b. The actual email -- "/embedded-1" c. The utf-8 text content of the email "/embedded-1/embedded...
   Author: Robert Letzler, 2018-10-03, 23:02
max files parameter question for Tika Server - Tika - [mail # user]
...Hello, Thanks for the quick fix! I will do more tests tomorrow with Tika server. I will let you know if I find something else.  Best regards, Olivier > On 3 Oct. 2018 at 21:29, Tim Allis...
   Author: Olivier Tavard, Tim Allison, ..., 2018-10-03, 19:48
[TIKA-2748] trivial tika-server bug w -maxFiles in new -spawnChild mode - Tika - [issue]
...options.addOption("maxFiles", false, "Only in spawn child mode: shutdown server after this many files -- use only in 'spawnChild' mode"); false->true...
   Author: Tim Allison, 2018-10-03, 19:27
[TIKA-2646] Tika parse["content"] returns jumbled text across cells of a table in a pdf - Tika - [issue]
...When text from a table is extracted, sometimes the order of the cells becomes mixed and the words get concatenated together. For example: HOURSDUR(hr)PHASECODESUBDESCRIPTION becomes: Hours Du...
   Author: Annie Didier, 2018-10-03, 16:26
[TIKA-2249] Tika not able to parse tables from pdf - Tika - [issue]
...Tika not able to parse tables from pdf. I want to attach sample pdf which I tried but attachment/browse link is not visible to me....
   Author: Amit Kumar, 2018-10-03, 16:26