[NUTCH-1749] Optionally exclude title from content field - Nutch - [issue]
...The HTML parser plugin inserts the document title into the document content. Since the title alone can be retrieved via DOMContentUtils.getTitle() and content is retrieved via DOMContentUtils.getTex...    Author: Greg Padiasek, 2018-07-02, 14:49
[NUTCH-1746] OutOfMemoryError in Mappers - Nutch - [issue]
...Initially I found that Generator was throwing OutOfMemoryError no matter how much RAM I allocated to the JVM. I fixed the problem by moving URLFilters, URLNormalizers and ScoringFilter...    Author: Greg Padiasek, 2017-10-19, 21:21
[NUTCH-1790] solrdedup causes OutOfMemoryError in Solr - Nutch - [issue]
...Nutch 1.7 and 2.2.1 use Hadoop 1.2. In this version, Hadoop overwrites "" variable set in mapred-site.xml and in local mode always sets it to 1. As a result Nutch creates a qu...    Author: Greg Padiasek, 2014-05-31, 18:26