Tastify 
Mahout  [mail # user]

...Karl, Do you have raw play events, or do you have progress events as well? The single biggest improvement you can make with this kind of system is to quantify engagement somehow.... 




Tastify  Mahout  [mail # user]

...Hierarchical modeling techniques work well on structures like this if you have good resolution of your metadata. Resolving and disambiguating artist and track names can be difficult u... 





Algorithms in Mahout 
Mahout  [mail # user]

...On Mon, Nov 25, 2013 at 3:14 AM, Manuel Blechschmidt wrote: > There are/were multiple kNN implementation in Mahout: > Recommender knn > http://grepcode.com/file/repo1.ma... 




Any utility to solve the matrix inversion in Map/Reduce Way 
Mahout  [mail # user]

...Left unsaid in this comment is the fact that matrix inversion of any sizable matrix is almost always a mistake because it is (a) inaccurate, (b) slow. In scalable numerics it is also c... 
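As a rough illustration of the alternative to explicit inversion, the sketch below solves A x = b directly by Gaussian elimination with partial pivoting. It is a small dense in-memory example only, not a Map/Reduce job and not a Mahout API; the function name `solve` is hypothetical.

```python
def solve(a, b):
    """Solve a x = b for a small dense square matrix without forming the inverse."""
    n = len(a)
    # Build an augmented copy [a | b] so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


x = solve([[2, 1], [1, 3]], [3, 5])
```

For large or sparse systems an iterative or factorization-based solver would normally be used instead; this only shows that the solution is available without ever computing the inverse.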




Regarding support of Evolutionary Algorithms on Apache Mahout 
Mahout  [mail # user]

...It still has recorded step meta mutation. This should be very good for continuous domains. It does not have a general genetic algorithm. Sent from my iPhone On Jan 1... 




Parallel ALSWR on very large matrix  crashing (I think) 
Mahout  [mail # user]

...So the total size of the data is modest at about 560 M nonzero elements. Total data should be small compared to your node sizes. But the distribution of your data can be importa... 




Need help in loading model for classification 
Mahout  [mail # user]

...Trim the model by setting a minimum term frequency. On Thu, Feb 2, 2012 at 9:39 PM, SAMIK CHAKRABORTY wrote: > Hi, > > I am new to mahout and hadoop. > > I h... 
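A minimal sketch of what trimming by minimum term frequency might look like, assuming tokenized documents held in memory. The function name `trim_vocabulary` and the parameter `min_term_freq` are hypothetical and are not Mahout option names.

```python
from collections import Counter


def trim_vocabulary(documents, min_term_freq=2):
    """Drop terms whose total count across all documents is below min_term_freq."""
    counts = Counter(term for doc in documents for term in doc)
    vocab = {term for term, count in counts.items() if count >= min_term_freq}
    trimmed = [[term for term in doc if term in vocab] for doc in documents]
    return trimmed, vocab


docs = [["spam", "offer", "now"], ["spam", "meeting"], ["offer", "spam"]]
trimmed, vocab = trim_vocabulary(docs, min_term_freq=2)
```

Fewer distinct terms means fewer parameters in the trained model, which is what makes it smaller to load.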




A little programming challenge 
Mahout  [mail # user]

...Mahout has an implementation as well. On Sunday, February 20, 2011, Dawid Weiss wrote: > Ok, I get it. Ted's suggestion is probably something to follow  > instead of seed... 




A little programming challenge  Mahout  [mail # user]

...This is why we normally use PRNGs based on murmur hash for building deterministic random vectors. Smutty and Jake can probably get you a specific pointer before I get back to a real compute... 
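The idea of a hash-seeded, deterministic random vector can be sketched as follows. This is not Mahout's implementation: the Python standard library has no murmur hash, so `hashlib.blake2b` stands in for it here, and the function name `deterministic_vector` is hypothetical.

```python
import hashlib
import random


def deterministic_vector(key, dim, seed=0):
    """Return the same pseudo-random vector every time for a given (key, seed)."""
    # Hash the key to get a reproducible 64-bit seed (blake2b stands in for murmur).
    digest = hashlib.blake2b(f"{seed}:{key}".encode("utf-8"), digest_size=8).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


v1 = deterministic_vector("term:mahout", 4)
v2 = deterministic_vector("term:mahout", 4)
```

Because the vector depends only on the key and the seed, it can be regenerated on any machine rather than stored or shipped around.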





Mahout: NB Model for Text Classification  In Sample Error 
Mahout  [mail # user]

...Yes for active learning, no for transduction. You can do semisupervised clustering as well. This is a special case of transduction generally in many ways. If you mean clas... 




Clustering Question (from a newbie) 
Mahout  [mail # user]

...Do the category values add up to 1 for every row? Where do these percentages come from? At 148 rows, I would use R instead of Mahout. On Tue, Nov 22, 2011 at 2:42 AM, Ferna... 




Clustering Question (from a newbie)  Mahout  [mail # user]

...I would recommend that you work with the original counts instead of percentages. That allows you to use statistical similarity measures based on the multinomial distribution. The... 



Clustering Question (from a newbie)  Mahout  [mail # user]

...Make sure that you add it once on the top and N times on the bottom of the expression (i.e. once for each category). On Wed, Nov 23, 2011 at 12:44 AM, Fernando O. wrote: >... 
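Read as additive smoothing of category counts, the advice above might be sketched like this. The interpretation, the function name `smoothed_proportions` and the default pseudo-count of 1 are assumptions, since the snippet is truncated.

```python
def smoothed_proportions(counts, pseudo_count=1.0):
    """Add the pseudo-count once to each numerator and N times to the denominator."""
    n = len(counts)          # number of categories
    total = sum(counts)      # total observed count for this row
    return [(c + pseudo_count) / (total + n * pseudo_count) for c in counts]


p = smoothed_proportions([3, 0, 1])
```

Adding the pseudo-count N times on the bottom is what keeps the smoothed proportions summing to 1.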





Loglikelihood ratio test as a probability 
Mahout  [mail # user]

...I think that this is a really bad thing to do. The LLR is really good to find interesting things. Once you have done that, directly using the LLR in any form to produce a weight ... 
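For reference, a minimal sketch of the log-likelihood ratio for a 2x2 contingency table, written in the entropy form. The function names are illustrative, and the threshold at the end is an arbitrary assumption used only to show LLR as a filter for interesting pairs rather than as a weight.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """LLR (G^2) score for a 2x2 table of co-occurrence counts."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


independent = log_likelihood_ratio(10, 10, 10, 10)
associated = log_likelihood_ratio(100, 5, 5, 1000)

# Use the score only to decide whether a pair is interesting (hypothetical cutoff).
THRESHOLD = 10.0
is_interesting = associated > THRESHOLD
```

The score is kept here only as a yes/no indicator, in line with the warning above against turning the raw LLR into a weight.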




Loglikelihood ratio test as a probability  Mahout  [mail # user]

...On Fri, Jun 21, 2013 at 8:25 AM, Dan Filimon wrote: > Thanks for the reference! I'll take a look at chapter 7, but let me first > describe what I'm trying to achieve. > > I... 



Loglikelihood ratio test as a probability  Mahout  [mail # user]

...Well, you are still stuck with the problem that pulling more bits out of the small count data is a bad idea. Most of the models that I am partial to never even honestly estimate probab... 



Loglikelihood ratio test as a probability  Mahout  [mail # user]

...On Fri, Jun 21, 2013 at 10:59 AM, Dan Filimon wrote: > Could you be more explicit? > What models are these, how do I use them to track how similar two items > are? > ... 



Loglikelihood ratio test as a probability  Mahout  [mail # user]

...See https://github.com/tdunning/inmemorycooccurrence for an in-memory implementation. Should just require three or so lines of code. On Fri, Jun 21, 2013 at 11:23 AM, Se... 





