On May 13, 2012, at 4:48am, Jacob Metcalf wrote:
If your reducer is running, then Hadoop must have distributed your job jar.
In that case, any class that's actually in your job jar (at the path that matches its package name) will be distributed and on the classpath.
Sometimes the problem is that you've got a dependent jar, which then needs to be in the "lib" subdirectory inside your job jar. Are you maybe building your Avro generated classes into a separate jar, and then adding that to the job jar?
Finally, running under Cygwin is…challenging. I teach a Hadoop class, and often the hardest part of the lab is getting everybody's Cygwin installation working with Hadoop. The fact that you've got pseudo-distributed mode working on Cygwin is impressive in itself, but I would suggest trying your job on a real cluster, for example with Elastic MapReduce.
Did you make sure that the dependent jar is inside the "lib" subdirectory of your job jar? What does your job jar look like (via "jar tvf <path to job jar>")?
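If eyeballing the "jar tvf" output is tedious, the same check can be scripted. The following is only a rough sketch, not anything Hadoop ships: the function names nested_jars and inspect_job_jar are made up for illustration, and it assumes the usual job jar layout where dependent jars sit directly under lib/. It relies on the fact that a jar is a zip archive, so Python's standard zipfile module can list it.

```python
import sys
import zipfile


def nested_jars(entry_names):
    """Split nested .jar entries into (directly under lib/, anywhere else).

    Hypothetical helper for illustration; it only looks at entry names.
    """
    in_lib, elsewhere = [], []
    for name in entry_names:
        if not name.endswith(".jar"):
            continue  # ordinary class or resource entry, not a nested jar
        parent = name.rpartition("/")[0]  # "" for entries at the jar root
        if parent == "lib":
            in_lib.append(name)
        else:
            elsewhere.append(name)
    return in_lib, elsewhere


def inspect_job_jar(path_or_file):
    """List a job jar (path or file-like object) and classify its nested jars."""
    with zipfile.ZipFile(path_or_file) as jar:
        return nested_jars(jar.namelist())


# Only runs when a jar path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    in_lib, elsewhere = inspect_job_jar(sys.argv[1])
    print("nested jars under lib/:", in_lib)
    print("nested jars elsewhere (suspect):", elsewhere)
```

Saved under a name of your choosing and run with the path to the job jar as its only argument, it prints the two lists. Anything in the second list is a nested jar that is probably not where it needs to be.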