Subject: Wrong language detection in tika server 1.22


In looking at the source code for this (for the first time?), it looks
like that endpoint expects UTF-8 text as its input.  It does not parse the
file first and then run language identification on the parsed text.
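
If that reading is right, one workaround is to extract the text first and then send the extracted text to the language endpoint. Below is a minimal sketch of that two-step call, not a tested recipe. It assumes the endpoint in question is /language/stream, that text extraction goes through /tika with "Accept: text/plain", that the extracted text comes back as UTF-8, and that the server listens on localhost:9998. The function names are made up for illustration.

```python
import urllib.request

# Assumed default address of a locally running tika-server.
TIKA_URL = "http://localhost:9998"


def build_extract_request(file_bytes, base_url=TIKA_URL):
    """Build a PUT to /tika that sends the raw file and asks for plain text."""
    return urllib.request.Request(
        base_url + "/tika",
        data=file_bytes,
        method="PUT",
        headers={
            "Accept": "text/plain",
            # Set explicitly so urllib does not default to a form-encoded type.
            "Content-Type": "application/octet-stream",
        },
    )


def build_language_request(text, base_url=TIKA_URL):
    """Build a PUT to /language/stream carrying already-extracted UTF-8 text."""
    return urllib.request.Request(
        base_url + "/language/stream",
        data=text.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )


def detect_language(file_bytes, base_url=TIKA_URL):
    """Parse the file first, then run language id on the parsed text."""
    with urllib.request.urlopen(build_extract_request(file_bytes, base_url)) as resp:
        # Assumption: the server returns the extracted text as UTF-8.
        text = resp.read().decode("utf-8")
    with urllib.request.urlopen(build_language_request(text, base_url)) as resp:
        return resp.read().decode("utf-8").strip()
```

Calling detect_language(open("some.pdf", "rb").read()) against a running server would then return whatever language code the server reports for the extracted text, rather than for the raw file bytes.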

On Thu, Dec 5, 2019 at 6:43 AM Juan Elosua <[EMAIL PROTECTED]> wrote: