Results 1 to 10 of 2926 (0.0s).
[NUTCH-2730] SitemapProcessor to treat sitemap URLs as Set instead of List - Nutch - [issue]
...https://archive.epa.gov/robots.txt lists 160k sitemap URLs, absurd! Almost 160k of them are duplicates, no friendly words to describe this astonishing fact. And although our Nutch locally che...
http://issues.apache.org/jira/browse/NUTCH-2730    Author: Markus Jelsma, 2020-07-14, 12:00
[ANNOUNCE] Apache Nutch 1.17 Release - Nutch - [mail # user]
...Thanks Sebastian! -----Original message----- > From: Sebastian Nagel > Sent: Thursday 2nd July 2020 16:42 > To: [EMAIL PROTECTED] > Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED...
   Author: Markus Jelsma, 2020-07-02, 16:08
[VOTE] Release Apache Nutch 1.17 RC#1 - Nutch - [mail # dev]
...Hello, +1 from me too! Thanks, Markus -----Original message----- > From: Furkan KAMACI > Sent: Saturday 20th June 2020 18:15 > To: [EMAIL PROTECTED] > Subject: Re: [VOTE] Release A...
   Author: Markus Jelsma, 2020-06-30, 09:47
[NUTCH-2794] Add additional ciphers to HTTP base's default cipher suite - Nutch - [issue]
...More sites switch to stronger cipher suites, and lib-http stays behind. This ticket adds some cipher suites and enables protocol-http to crawl affected sites....
http://issues.apache.org/jira/browse/NUTCH-2794    Author: Markus Jelsma, 2020-06-17, 16:13
[NUTCH-1186] FreeGenerator always normalizes - Nutch - [issue]
...The FreeGenerator does not honor the -normalize option; it always normalizes all URLs in the input directory. The -filter option is respected....
http://issues.apache.org/jira/browse/NUTCH-1186    Author: Markus Jelsma, 2020-06-17, 09:35
[NUTCH-2710] Normalize outlinks before checking for internal or external links - Nutch - [issue]
...We have a normalizer that transforms external URLs back to internal URLs. But those URLs are never passed to the normalizer, because they have already been filtered out by internal and/or ex...
http://issues.apache.org/jira/browse/NUTCH-2710    Author: Markus Jelsma, 2020-06-17, 09:35
eDismax query syntax question - Solr - [mail # user]
...Hello, These are special characters; if you don't need them, you must escape them. See top of the article: https://lucene.apache.org/solr/guide/8_5/the-extended-dismax-query-parser.html Markus...
   Author: Markus Jelsma, 2020-06-13, 09:57
[PROPOSAL] Replace whitelist blacklist with allowlist denylist - Nutch - [mail # dev]
...Hello Lewis, I understand the proposal. As an engineer, however, I have some points I would like to address: * The proposed change is not backward compatible, which weighs heavily because it is ...
   Author: Markus Jelsma, 2020-06-10, 10:06
Building a web based search engine - Solr - [mail # user]
...Hello, see inline. Markus -----Original message----- > From: Jim Anderson > Sent: Tuesday 2nd June 2020 19:59 > To: [EMAIL PROTECTED] > Subject: Re: Building a web based search ...
   Author: Markus Jelsma, 2020-06-02, 18:36
[SOLR-8673] o.a.s.search.facet classes not public/extendable - Solr - [issue]
...It is not easy to create a custom JSON facet function. A simple function based on AvgAgg quickly results in the following compilation failures: [ERROR] Failed to execute goal org.apache.maven...
http://issues.apache.org/jira/browse/SOLR-8673    Author: Markus Jelsma, 2020-05-29, 13:38