Great news on the forthcoming Any23 2.2 release. I look forward to it and to the subsequent version bump in Nutch.
In the meantime, I managed to build Any23 from master, copy the any23 jars into Nutch (master), and reference them in the plugin…
Unfortunately, when I reran the Nutch parsechecker, it could no longer parse. A quick look at logs/hadoop.log reveals that the updated any23 depends on new classes in other jar files:
Caused by: java.lang.NoClassDefFoundError: org/apache/commons/rdf/api/IRI
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.semanticweb.owlapi.rio.OWLAPIRDFFormat
java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.jsoup.select.NodeTraversor.traverse(Lorg/jsoup/select/NodeVisitor;Lorg/jsoup/nodes/Node;)V
I guess I would need to rebuild Nutch from master (rather than just copying a few jar files) and ensure that any23’s jar dependencies are also referenced.
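
For what it's worth, a quick way to confirm which of those classes are actually visible is a small check like the one below. This is only a rough sketch: the class name ClasspathCheck and the helper isLoadable are made up for illustration, and it assumes you run it with the same jars on the classpath that the plugin references (Nutch plugins get their own class loader, so the main Nutch classpath may not tell the whole story).

```java
// Hypothetical helper, not part of Nutch or Any23.
class ClasspathCheck {

    // Returns true if the named class can be found and linked by this class's loader.
    // initialize=false avoids running static initializers during the check.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving the class.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the errors in hadoop.log above.
        String[] names = {
            "org.apache.commons.rdf.api.IRI",
            "org.semanticweb.owlapi.rio.OWLAPIRDFFormat",
            "org.jsoup.select.NodeTraversor"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

Note that this only tells you whether a class is present. The jsoup error is a NoSuchMethodError, which means NodeTraversor was found but without a method of that signature, so that one looks like a jsoup version mismatch rather than a missing jar, and a presence check like this would not catch it.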