ManifoldCF's memory consumption is bounded, but it scales with the number of
worker threads you allow. If you have 100 worker threads and each document can
consume 50 MB, then you need at least 5 GB for Solr output alone. Tika is also
quite expensive memory-wise, so I'd allocate at least 10 GB for ManifoldCF to
support the pipeline you have set up.
The best way to control memory, therefore, is probably to reduce the number
of worker threads.
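
To make the arithmetic concrete, here is a rough sketch of the worst-case
estimate. The function names and the 2000 MB budget are only illustrative
(nothing here is part of ManifoldCF), and it covers the Solr output
buffering only, not Tika's overhead:

```python
def solr_output_buffer_mb(worker_threads: int, max_doc_mb: int) -> int:
    """Worst case: every worker thread holds one maximum-size document at once."""
    return worker_threads * max_doc_mb


def max_worker_threads(budget_mb: int, max_doc_mb: int) -> int:
    """Largest thread count whose worst-case buffering fits in budget_mb."""
    return budget_mb // max_doc_mb


if __name__ == "__main__":
    # The numbers from above: 100 threads at 50 MB per document.
    print(solr_output_buffer_mb(100, 50))  # 5000 MB, i.e. about 5 GB

    # Hypothetical: if only 2000 MB can be spared for output buffering.
    print(max_worker_threads(2000, 50))
```

Going the other way (second function) shows why cutting the thread count is
the simplest lever: the requirement drops linearly with it.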
(I assume you are using the combined war here, otherwise Tomcat would not
On Thu, Jan 18, 2018 at 6:44 AM, Shashank Raj <[EMAIL PROTECTED]>