Subject: Wrong language detection in tika server 1.22

Looking at the source code for this (for the first time?), it looks
like that endpoint expects UTF-8 text. It does not parse the file and
then run language identification on the parsed text.
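
If that reading is right, a workaround would be to do the two steps
yourself: have the server extract the text first, then send the
extracted text to the language endpoint. Below is a minimal sketch of
that idea, not a tested fix. It assumes the endpoint in question is
/language/stream, that the server is at the default
http://localhost:9998, and that /tika returns the extracted text as
UTF-8 when asked for text/plain. The helper names (http_put,
detect_file_language) are made up for illustration.

```python
import urllib.request

# Assumption: tika-server running locally on its default port.
TIKA_URL = "http://localhost:9998"


def http_put(url, body, headers):
    """Send a PUT request and return the raw response body as bytes."""
    req = urllib.request.Request(url, data=body, headers=headers, method="PUT")
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def detect_file_language(path, put=http_put, base_url=TIKA_URL):
    """Parse the file with /tika, then run language id on the parsed text.

    `put` is injectable so the flow can be exercised without a live server.
    """
    with open(path, "rb") as f:
        raw = f.read()

    # Step 1: let Tika parse the file and hand back plain text.
    text = put(base_url + "/tika", raw, {"Accept": "text/plain"})

    # Step 2: send the extracted text (assumed UTF-8) to the language
    # endpoint (assumed here to be /language/stream).
    lang = put(
        base_url + "/language/stream",
        text,
        {"Content-Type": "text/plain; charset=UTF-8"},
    )
    return lang.decode("utf-8").strip()
```

Calling detect_file_language("some.pdf") against a running server
should then report the language of the document's text rather than of
its raw bytes, if the assumptions above hold.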

