I am not sure I understand the Hadoop svn structure; however, I was able to
make it work with Hadoop trunk, i.e. 0.20.0-dev.
It didn't work with hadoop/branch-0.18, with or without patch 4277.
Here is a copy-paste of the steps, run once Hadoop is built and installed. I
am using the exact same "apache-mahout-examples-0.1-dev.job", not rebuilt
against the 0.20.0-dev jars.
It works!
That would mean that the bug/feature is not related to HADOOP-4277
<http://issues.apache.org/jira/browse/HADOOP-4277>,
and was reintroduced (or never removed) in hadoop/trunk.
hadoop@phil:/usr/local/hadoop$ bin/hadoop namenode -format
08/10/29 18:27:59 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = phil/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.0-dev
STARTUP_MSG: build = -r ; compiled by 'philippe' on Wed Oct 29 18:25:08 EDT 2008
************************************************************/
08/10/29 18:28:00 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
08/10/29 18:28:00 INFO namenode.FSNamesystem: supergroup=supergroup
08/10/29 18:28:00 INFO namenode.FSNamesystem: isPermissionEnabled=true
08/10/29 18:28:00 INFO common.Storage: Image file of size 96 saved in 0 seconds.
08/10/29 18:28:00 INFO common.Storage: Storage directory /usr/local/hadoop-datastore/hadoop-hadoop/dfs/name has been successfully formatted.
08/10/29 18:28:00 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at phil/127.0.1.1
************************************************************/
hadoop@phil:/usr/local/hadoop$ bin/hadoop dfs -put /home/philippe/synthetic_control.data testdata
hadoop@phil:/usr/local/hadoop$ bin/hadoop jar /home/philippe/workspace/MahoutJava/examples/build/apache-mahout-examples-0.1-dev.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
08/10/29 18:28:45 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
08/10/29 18:28:46 INFO mapred.FileInputFormat: Total input paths to process : 1
08/10/29 18:28:47 INFO mapred.JobClient: Running job: job_200810291828_0002
08/10/29 18:28:48 INFO mapred.JobClient: map 0% reduce 0%
08/10/29 18:28:54 INFO mapred.JobClient: map 50% reduce 0%
08/10/29 18:28:55 INFO mapred.JobClient: map 100% reduce 0%
08/10/29 18:28:56 INFO mapred.JobClient: Job complete: job_200810291828_0002
08/10/29 18:28:56 INFO mapred.JobClient: Counters: 7
08/10/29 18:28:56 INFO mapred.JobClient: File Systems
08/10/29 18:28:56 INFO mapred.JobClient: HDFS bytes read=291644
08/10/29 18:28:56 INFO mapred.JobClient: HDFS bytes written=323660
08/10/29 18:28:56 INFO mapred.JobClient: Job Counters
08/10/29 18:28:56 INFO mapred.JobClient: Launched map tasks=2
08/10/29 18:28:56 INFO mapred.JobClient: Data-local map tasks=2
08/10/29 18:28:56 INFO mapred.JobClient: Map-Reduce Framework
08/10/29 18:28:56 INFO mapred.JobClient: Map input records=600
08/10/29 18:28:56 INFO mapred.JobClient: Map input bytes=288374
08/10/29 18:28:56 INFO mapred.JobClient: Map output records=600
08/10/29 18:28:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
08/10/29 18:28:56 INFO mapred.FileInputFormat: Total input paths to process : 2
08/10/29 18:28:56 INFO mapred.JobClient: Running job: job_200810291828_0003
08/10/29 18:28:57 INFO mapred.JobClient: map 0% reduce 0%
08/10/29 18:29:03 INFO mapred.JobClient: map 50% reduce 0%
08/10/29 18:29:05 INFO mapred.JobClient: map 100% reduce 0%
08/10/29 18:29:10 INFO mapred.JobClient: map 100% reduce 100%
08/10/29 18:29:11 INFO mapred.JobClient: Job complete: job_200810291828_0003
08/10/29 18:29:11 INFO mapred.JobClient: Counters: 16
08/10/29 18:29:11 INFO mapred.JobClient: File Systems
08/10/29 18:29:11 INFO mapred.JobClient: HDFS bytes read=323660
08/10/29 18:29:11 INFO mapred.JobClient: HDFS bytes written=9657
08/10/29 18:29:11 INFO mapred.JobClient: Local bytes read=36119
08/10/29 18:29:11 INFO mapred.JobClient: Local bytes written=72300
08/10/29 18:29:11 INFO mapred.JobClient: Job Counters
08/10/29 18:29:11 INFO mapred.JobClient: Launched reduce tasks=1
08/10/29 18:29:11 INFO mapred.JobClient: Launched map tasks=2
08/10/29 18:29:11 INFO mapred.JobClient: Data-local map tasks=2
08/10/29 18:29:11 INFO mapred.JobClient: Map-Reduce Framework
08/10/29 18:29:11 INFO mapred.JobClient: Reduce input groups=1
08/10/29 18:29:11 INFO mapred.JobClient: Combine output records=28
08/10/29 18:29:11 INFO mapred.JobClient: Map input records=600
08/10/29 18:29:11 INFO mapred.JobClient: Reduce output records=7
08/10/29 18:29:11 INFO mapred.JobClient: Map output bytes=943020
08/10/29 18:29:11 INFO mapred.JobClient: Map input bytes=323660
08/10/29 18:29:11 INFO mapred.JobClient: Combine input records=1732
08/10/29 18:29:11 INFO mapred.JobClient: Map output records=1732
08/10/29 18:29:11 INFO mapred.JobClient: Reduce input records=28
08/10/29 18:29:11 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
08/10/29 18:29:11 INFO mapred.FileInputFormat: Total input paths to process
08/10/29 18:29:12 INFO mapred.JobClient: Running job: job_200810291828_0004
08/10/29 18:29:13 INFO mapred.JobClient: map 0% reduce 0%
08/10/29 18:29:20 INFO mapred.JobClient: map 50% reduce 0%
08/10/29 18:29:22 INFO mapred.JobClient: map 100% reduce 0%
08/10/29 18:29:27 INFO mapred.JobClient: map 100% reduce 100%
08/10/29 18:29:28 INFO mapred.JobClient: Job complete: job_200810291828_0004
08/10/29 18:29:28 INFO mapred.JobClient: Counters: 16
08/10/29 18:29:28 INFO mapred.JobClient: File Systems
08/10/29 18:29:28 INFO mapred.JobClient: HDFS bytes read=342974
08/10/29 18:29:28 INFO mapred.Jo