You definitely need to separate the data into three sets.

Another way to put it is that with cross validation, any learning algorithm
needs to have the test data withheld from it. The remaining data is the
training data, to be used by the learning algorithm however it sees fit.

Some training algorithms, such as the one that you describe, divide their
training data into portions so that they can learn hyper-parameters
separately from parameters. Whether the learning algorithm does this or
uses some other technique to arrive at a final model has no bearing on
whether the original test data is withheld. Because the test data has to
be withheld unconditionally, no sub-division of the training data can
include any of the test data.
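
As a rough sketch of that point, here is a learner that privately splits
its own training data to pick a hyper-parameter. This is not Mahout code:
a toy shrunken-mean model stands in for ALS, and every name in it (fit,
mse, fit_with_tuning, the candidate values) is made up for illustration.

```python
import random

def fit(train, lam):
    """Toy 'model': a shrunken mean, with lam as the hyper-parameter."""
    return sum(train) / (len(train) + lam)

def mse(model, data):
    """Mean squared error of the constant prediction `model` on `data`."""
    return sum((y - model) ** 2 for y in data) / len(data)

def fit_with_tuning(training_data, candidates, seed=0):
    """Learner that splits its own training data to choose lam.

    It only ever sees `training_data`; the caller's test data is never
    passed in, so the inner split cannot contain any of it.
    """
    rng = random.Random(seed)
    data = training_data[:]          # copy so the caller's list is untouched
    rng.shuffle(data)
    cut = (2 * len(data)) // 3       # inner split: 2/3 train, 1/3 validation
    inner_train, inner_val = data[:cut], data[cut:]
    best_lam = min(candidates,
                   key=lambda lam: mse(fit(inner_train, lam), inner_val))
    return best_lam, fit(inner_train, best_lam)

# Hypothetical data: 200 noisy observations around 5.0.
rng = random.Random(42)
all_data = [rng.gauss(5.0, 1.0) for _ in range(200)]

# The test data is withheld first, before the learner sees anything.
test_data, training_data = all_data[:50], all_data[50:]

best_lam, model = fit_with_tuning(training_data,
                                  candidates=[0.0, 1.0, 10.0, 100.0])

# Computed once, after tuning is finished.
test_error = mse(model, test_data)
```

How fit_with_tuning divides its input is its own business. The only
thing the experiment has to guarantee is that test_data never reaches it.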

In your case, you hold back 25% of the data as test data. Then you divide
the remaining 75% into 25% validation and 50% training (both as fractions
of the full data set). The validation set has to be separate from the 50%
training set in order to avoid over-fitting, and the test data has to be
separate from training+validation for the same reason.
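
A minimal sketch of that per-user 50/25/25 split, assuming the clicks are
available as (user, item) pairs. The function name split_per_user and the
toy data are hypothetical, and this is plain Python rather than anything
from Mahout.

```python
import random
from collections import defaultdict

def split_per_user(clicks, seed=0):
    """Split (user, item) clicks into train/validation/test, per user.

    For each user, 25% of the clicks go to test, 25% to validation,
    and the remainder (about 50%) to train.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, item in clicks:
        by_user[user].append(item)

    train, validation, test = [], [], []
    for user, items in by_user.items():
        items = items[:]             # do not mutate the grouped data
        rng.shuffle(items)
        n_test = len(items) // 4
        n_val = len(items) // 4
        test += [(user, i) for i in items[:n_test]]
        validation += [(user, i) for i in items[n_test:n_test + n_val]]
        train += [(user, i) for i in items[n_test + n_val:]]
    return train, validation, test

# Hypothetical toy data: 3 users with 8 distinct clicks each.
clicks = [(u, i) for u in ("u1", "u2", "u3") for i in range(8)]
train, validation, test = split_per_user(clicks)
```

With integer division, users with fewer than four clicks contribute
nothing to validation or test, so very sparse users may need separate
handling.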

On Tue, Sep 10, 2013 at 4:22 PM, Parimi Rohit <[EMAIL PROTECTED]> wrote:

> Hi All,
>
> I was wondering if there is any experimental design to tune the parameters
> of the ALS algorithm in Mahout, so that we can compare its recommendations
> with recommendations from another algorithm.
>
> My datasets have implicit data and I would like to use the following design
> for tuning the ALS parameters (alpha, lambda, numfeatures).
>
> 1. Split the data such that for each user, 50% of the clicks go to train,
> 25% go to validation, 25% go to test.
>
> 2. Create the user and item features by applying the ALS algorithm on
> training data, and test on the validation set. (We can pick the parameters
> which minimize the RMSE score, in the case of implicit data, Pui - XY’)
>
> 3. Once we find the parameters which give the best RMSE value on
> validation, use the user and item matrices generated for those parameters
> to predict the top k items and test it with the items in the test set
> (compute mean average precision).
>
> Although the above setting looks good, I have a few questions
>
> 1. Do we have to follow this setting, to compare algorithms? Can't we
> report the parameter combination for which we get the highest mean average
> precision for the test data, when trained on the train set, without any
> validation set?
>
> 2. Do we have to tune the "similarityclass" parameter in item-based CF? If
> so, do we compare the mean average precision values based on validation
> data, and then report the same for the test set?
>
> My ultimate objective is to compare different algorithms but I am confused
> as to how to compare the best results (based on parameter tuning) between
> algorithms. Are there any publications that explain this in detail? Any
> help/comments about the design of experiments is much appreciated.
>
> Thanks,
> Rohit