Yes, there is a big reason: with tika-dl you don’t need an external
server running to use it. And of course you can statically analyze the
code (with the other solution you have to mix languages to do that), etc.


So yes, we should keep them both…

From: Tim Allison <[EMAIL PROTECTED]>
Date: Friday, July 6, 2018 at 4:30 PM
Subject: Re: image do the parts play together?


This is very helpful. Thank you! Is there any use in having the tika-dl
module if our more modern approach is REST + Docker? The upkeep in tika-dl
is nontrivial.


On Fri, Jul 6, 2018 at 6:15 PM Chris Mattmann <[EMAIL PROTECTED]> wrote:

Thanks. There are multiple modes of integrating deep learning with Tika:



The original mode: uses Thamme’s work exposing TensorFlow over REST, with
Docker, to provide a REST service to Tika for running DL models. We
initially did Inception_v3, and a model by Madhav Sharan that combines
OpenCV with Inception v3 (and a new docker that installs OpenCV, since
it’s a pain) for image and video object recognition, respectively. See:
and and also the wiki

Later, Thamme, Avtar Singh, and KranthiGV added DL4J support:
including Inceptionv3 and VGG16 -
This houses the model in the USC Data Science repo and uses it as an
example of how to store and load models from Keras/Python into DL4J:

Then, Thejan added Text Captioning and a new Docker, and trained model:

Then Raunaq from UPenn added Inception v4 support via the
Docker/TensorFlow way:

All this Docker work caused Thejan and others to think we needed to
refactor the dockers. We did that here: to make them cleaner,
and to depend on: and on
models for image captioning. Now, Video and Image recognition and Image
Captioning all had the same base docker and sub dockers from that.



That’s where we’re at today. Make sense? ☺ Thejan and others want to add
more DL4J-supported models, and we can always use TensorFlow/Docker as
well as a way of doing it.

From: Tim Allison <[EMAIL PROTECTED]>
Date: Friday, July 6, 2018 at 2:39 PM
Subject: image do the parts play together?




On Twitter, Chris, Thamme, Thejan, and I are working with some
deeplearning4j devs to help us upgrade to deeplearning4j 1.0.0-BETA






I initially requested help from Thejan (and Thamme :D) for this because we
were getting an initialization exception after the upgrade in tika-dl's






According to our wiki[2], we upgraded to InceptionV4 in Tika-2306 by adding
the TensorFlowRESTRecogniser...does this mean we can get rid of
DL4JInceptionV3Net?  Or, what are we actually asking the dl4j folks to help






How do these recognizers play together?




Thank you.












[1] e.g.