Hi David,
Answers inline

On Thu, Feb 8, 2018 at 9:19 AM, <[EMAIL PROTECTED]> wrote:

Excellent.

No, you just add 'any23' to the list of plugins within the plugin.includes
property of nutch-site.xml.
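
As a rough illustration of the above, here is a small Python sketch that checks
whether a plugin id is covered by plugin.includes. It assumes the property value
is a regular expression matched against whole plugin ids. The XML content is a
hypothetical example: every plugin name other than 'any23' is a placeholder, and
the helper names are made up for illustration, so adapt it to your own
nutch-site.xml.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical nutch-site.xml content. Every plugin name other than 'any23'
# is an illustrative placeholder, not a recommended configuration.
NUTCH_SITE_XML = """<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|any23</value>
  </property>
</configuration>
"""


def plugin_includes(xml_text):
    """Return the plugin.includes value, or an empty string if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "plugin.includes":
            return (prop.findtext("value") or "").strip()
    return ""


def is_plugin_enabled(xml_text, plugin_id):
    """Assumption: the value is a regex that must match the whole plugin id."""
    pattern = plugin_includes(xml_text)
    return bool(pattern) and re.fullmatch(pattern, plugin_id) is not None


if __name__ == "__main__":
    print("any23 enabled:", is_plugin_enabled(NUTCH_SITE_XML, "any23"))
```

Remember that the existing entries in your plugin.includes value stay as they
are; 'any23' is simply appended as one more alternative.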

OK, so the current configuration for the Any23 plugin is to store extracted
structured data markup in the Nutch Metadata object under the key
"Any23-Triples". You can locate it using something like the ParserChecker
tool provided via the 'nutch' script. Likewise, you can also locate it, as a
representation of what would be indexed, by using the IndexerChecker
tooling also provided within the 'nutch' script.

For example, after crawling https://smartive.ch/jobs, the data is indexed as
follows:
          "structured_data": [
            {
              "node": "<https://smartive.ch/jobs>",
              "value": "\"IE-edge,chrome=1\"@de",
              "key": "<http://vocab.sindice.net/any23#X-UA-Compatible>",
              "short_key": "X-UA-Compatible"
            },
            {
              "node": "<https://smartive.ch/jobs>",
              "value": "\"Wir sind smartive \\u2014 eine dynamische, innovative Schweizer Webentwicklungsagentur. Die Realisierung zeitgem\\u00E4sser Webl\\u00F6sungen geh\\u00F6rt genauso zu unserer Passion, wie die konstruktive Zusammenarbeit mit unseren Kundinnen und Kunden.\"@de",
              "key": "<http://vocab.sindice.net/any23#description>",
              "short_key": "description"
            },
            {
              "node": "<https://smartive.ch/jobs>",
              "value": "\"width=device-width, initial-scale=1, shrink-to-fit=no\"@de",
              "key": "<http://vocab.sindice.net/any23#viewport>",
              "short_key": "viewport"
            },
            {
              "node": "<https://smartive.ch/jobs>",
              "value": "\"width=device-width,initial-scale=1\"@de",
              "key": "<http://vocab.sindice.net/any23#viewport>",
              "short_key": "viewport"
            },
            {
              "node": "<https://smartive.ch/jobs>",
              "value": "\"ie=edge\"@de",
              "key": "<http://vocab.sindice.net/any23#x-ua-compatible>",
              "short_key": "x-ua-compatible"
            }
          ],
Note from the above that the 'key' field, which holds the predicate, is very
useful for quickly filtering for things like Hotel Ratings or something similar.
See the tooling for ParserChecker and IndexerChecker as explained above.
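
To make the filtering idea concrete, here is a minimal Python sketch that picks
entries out of a "structured_data" list by their "short_key". The entries are
copied from the example above (the long description entry is left out for
brevity). The function name is hypothetical, and matching case-insensitively is
my own choice, made because the sample contains both "X-UA-Compatible" and
"x-ua-compatible".

```python
# Entries copied from the indexed example above (description entry omitted).
structured_data = [
    {"node": "<https://smartive.ch/jobs>",
     "value": '"IE-edge,chrome=1"@de',
     "key": "<http://vocab.sindice.net/any23#X-UA-Compatible>",
     "short_key": "X-UA-Compatible"},
    {"node": "<https://smartive.ch/jobs>",
     "value": '"width=device-width, initial-scale=1, shrink-to-fit=no"@de',
     "key": "<http://vocab.sindice.net/any23#viewport>",
     "short_key": "viewport"},
    {"node": "<https://smartive.ch/jobs>",
     "value": '"width=device-width,initial-scale=1"@de',
     "key": "<http://vocab.sindice.net/any23#viewport>",
     "short_key": "viewport"},
    {"node": "<https://smartive.ch/jobs>",
     "value": '"ie=edge"@de',
     "key": "<http://vocab.sindice.net/any23#x-ua-compatible>",
     "short_key": "x-ua-compatible"},
]


def filter_by_short_key(entries, short_key):
    """Return the entries whose short_key matches, ignoring case."""
    wanted = short_key.lower()
    return [e for e in entries if e.get("short_key", "").lower() == wanted]


if __name__ == "__main__":
    for entry in filter_by_short_key(structured_data, "viewport"):
        print(entry["node"], entry["value"])
```

The same approach works on the full "key" URI if you need to tell apart
predicates from different vocabularies that share a short name.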
Any further questions, please let me know.
Lewis