Subject: How to find the k most similar docs


I assume that the other matrix operations will consume and produce
<Text, MatrixWritable>? If so, how do you create <Text, MatrixWritable>
from the output of rowid, which is <IntWritable, VectorWritable>?

Also, while we are at it, how do you use vectordump? Running "bin/mahout
vectordump --help" produces output that is unreadable. I would
have guessed that vectordump would work on either <IntWritable,
VectorWritable> (the output of rowid) or <Text, VectorWritable> (the
contents of tfidf-vectors/part-r-00000), but it doesn't seem to work on
either using "bin/mahout vectordump -s path-to-file".

Thanks
Pat

On 3/9/12 4:26 AM, Suneel Marthi wrote: