Subject: https://issues.apache.org/jira/browse/MAHOUT-662 -- misplaced calls to setJarByClass


So, I think I've gotten to the bottom of this, and, due to what I'd
call questionably useful design in Hadoop, I'm a bit frustrated.

The ground state of Hadoop is not to copy any jars anywhere -- it
assumes that whatever you need is already out there where you need it.
The 'jar' subcommand of the hadoop command does not trigger any
copying of anything.

If you manage to call setJar on the 'overall job' jar, then the whole
thing will travel around, lib directory and all, carrying dependencies
with it.

If you call setJar on a jar from inside lib/, or on some other jar,
just that one jar will travel around, without the rest.
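
To make the setJarByClass side of this concrete: as far as I can
tell, it amounts to asking the classloader which jar a class was
loaded from and then using that path. Below is a plain-JDK sketch of
that lookup. It is not Hadoop's code; JarLocator, classResourcePath and
containingJar are names I made up for illustration, and it ignores
URL-escaped characters in paths.

```java
import java.net.URL;

public class JarLocator {

    /** Turns a class into the resource path a classloader would look up. */
    static String classResourcePath(Class<?> cls) {
        return cls.getName().replace('.', '/') + ".class";
    }

    /**
     * Returns the path of the jar the class was loaded from, or null if
     * the class did not come from a jar (e.g. a plain classes/ directory).
     */
    static String containingJar(Class<?> cls) {
        ClassLoader loader = cls.getClassLoader();
        if (loader == null) {
            // Bootstrap classes report a null loader; fall back to the system one.
            loader = ClassLoader.getSystemClassLoader();
        }
        URL url = loader.getResource(classResourcePath(cls));
        if (url == null || !"jar".equals(url.getProtocol())) {
            return null;
        }
        // A jar URL's path looks like file:/path/to/some.jar!/pkg/Cls.class
        String path = url.getPath();
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        int bang = path.indexOf('!');
        return bang < 0 ? path : path.substring(0, bang);
    }

    public static void main(String[] args) {
        System.out.println("resource: " + classResourcePath(JarLocator.class));
        System.out.println("jar:      " + containingJar(JarLocator.class));
    }
}
```

The point is that the answer depends entirely on where the class
happens to live. A class that sits in a jar under lib/ resolves to that
inner jar, not to the overall job jar, which is how a misplaced call
ends up shipping the wrong thing.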

So my 'idea #2' no longer appeals to me. Really, if we want people to
be able to play Legos with our jobs, we need to give them an API that
lets them own the JobConf and thus control the jar situation. Our own
example jar is, sadly, a case in point.
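
Roughly the shape I have in mind, as a toy with no Hadoop on the
classpath: the caller builds the configuration and the library job only
fills in a jar if the caller has not already chosen one. Conf is a
stand-in for JobConf that models nothing but the jar setting, and
LibraryJob, configure and DEFAULT_JAR are hypothetical names, not
anything in Mahout today.

```java
import java.util.HashMap;
import java.util.Map;

public class CallerOwnedConf {

    /** Stand-in for Hadoop's JobConf (hypothetical); only models the jar setting. */
    static class Conf {
        private final Map<String, String> props = new HashMap<>();

        void setJar(String jar) {
            props.put("mapred.jar", jar);
        }

        String getJar() {
            return props.get("mapred.jar");
        }
    }

    /** Hypothetical library job that accepts a caller-supplied configuration. */
    static class LibraryJob {
        static final String DEFAULT_JAR = "library-job.jar";

        /** Applies job settings, but never overrides a jar the caller already set. */
        static Conf configure(Conf conf) {
            if (conf.getJar() == null) {
                conf.setJar(DEFAULT_JAR);
            }
            return conf;
        }
    }

    public static void main(String[] args) {
        // The caller owns the conf and points it at their own job jar.
        Conf mine = new Conf();
        mine.setJar("my-app.job");
        System.out.println("caller set jar: " + LibraryJob.configure(mine).getJar());

        // With nothing set, the library falls back to its own default.
        System.out.println("default jar:    " + LibraryJob.configure(new Conf()).getJar());
    }
}
```

The design choice is just that the jar decision moves out of the job
class and into the hands of whoever assembles the final job jar.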

We have a lot of Job classes. So any scheme to allow user control of
this will be a lot of little edits. I'll make a branch in my little
gitiverse and see how far I get.