You definitely need to separate your data into three sets.
Another way to put it is that with cross-validation, any learning algorithm
needs to have the test data withheld from it. The remaining data is the
training data used by the learning algorithm.
Some training algorithms, such as the one you describe, divide their
training data into portions so that they can learn hyper-parameters
separately from parameters. Whether the learning algorithm does this or
uses some other technique to arrive at a final model has no bearing on
whether the original test data is withheld. Because the test data must be
unconditionally withheld, any sub-division of the training data cannot
include any of the test data.
In your case, you hold back 25% of the data as the test set. Then you
divide the remaining 75% into 25% validation and 50% training (as fractions
of the whole). The validation set has to be separate from the 50% training
set to avoid over-fitting, and the test data has to be separate from
training+validation for the same reason.
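As a concrete sketch of that 25/25/50 split (assuming scikit-learn is available; the toy arrays and random seeds here are just illustrative, not from your setup):

```python
# Sketch of the split described above: hold back 25% test first,
# then split the remaining 75% into validation and training.
# Assumes scikit-learn; data here is a toy array for illustration.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(100, 1)  # toy features
y = np.arange(100)                  # toy labels

# Step 1: withhold 25% as the test set; it is never touched again
# until final evaluation.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Step 2: split the remaining 75% into validation (25% of the total,
# i.e. 1/3 of the remainder) and training (50% of the total).
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=1/3, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 50 25 25
```

The key point the code makes explicit: the second split only ever sees `X_rest`, so no test example can leak into training or validation.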
On Tue, Sep 10, 2013 at 4:22 PM, Parimi Rohit <[EMAIL PROTECTED]> wrote: