A scaling law for the validation set training set ratio.

I. Guyon.
Unpublished Technical Report, AT&T Bell Laboratories.
1996



We address the problem of determining what fraction of the available training data should be reserved as a development test set, or validation set. We find that the ratio of the validation set size to the training set size scales like the square root of the ratio of two complexity parameters: the complexity of the second level of inference (minimizing the validation error) divided by the complexity of the first level of inference (minimizing the error rate on the training set).
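
As a rough illustration of the scaling law stated in the abstract, the sketch below turns two complexity values into a train/validation split. It is not taken from the report: the function names, the `constant` parameter (set to 1.0 here), and the way the complexities are measured are all assumptions made for the example, since the abstract gives only the square-root dependence.

```python
import math


def validation_to_training_ratio(complexity_validation, complexity_training, constant=1.0):
    """Return the validation/training size ratio under the assumed scaling law.

    complexity_validation: complexity of the second level of inference
        (choosing among models by validation error); hypothetical measure.
    complexity_training: complexity of the first level of inference
        (fitting a model on the training set); hypothetical measure.
    constant: unknown proportionality factor, assumed to be 1.0 here.
    """
    if complexity_validation <= 0 or complexity_training <= 0:
        raise ValueError("complexity parameters must be positive")
    return constant * math.sqrt(complexity_validation / complexity_training)


def split_sizes(total_examples, complexity_validation, complexity_training, constant=1.0):
    """Split total_examples into (n_training, n_validation) using the ratio above."""
    ratio = validation_to_training_ratio(
        complexity_validation, complexity_training, constant
    )
    # If n_val / n_train = ratio, then n_val = total * ratio / (1 + ratio).
    n_validation = round(total_examples * ratio / (1.0 + ratio))
    return total_examples - n_validation, n_validation


if __name__ == "__main__":
    # Illustrative values only: the second level is 100 times less complex
    # than the first, which gives a validation/training ratio of about 0.1.
    print(split_sizes(1100, complexity_validation=1.0, complexity_training=100.0))
```

Under these assumptions, a second level of inference that is much simpler than the first (for example, choosing among a handful of trained models) calls for a validation set that is only a small fraction of the training set.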

Keywords: cross-validation, learning theory, statistics, machine learning, pattern recognition, training set, validation set, test set, experiment design.




