Model Cross Validation


WHAT ARE THE DIFFERENT METHODS OF
MODEL CROSS-VALIDATION?

www.benchmarksixsigma.com
MEANING

Model cross-validation is a
technique used in machine learning
to estimate how accurately a model
will predict on real-life data that
it was not trained on.

COMMON VALIDATION
TECHNIQUES:

1) The Validation Set Approach
(Data Split): In this approach, the
data is randomly split into two
sets. One set, 50% of the dataset,
is used to train the model, and the
remaining 50% is used to test the
model.
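
As a minimal sketch of the split described above, the following uses only the Python standard library. The function name `validation_split` and its parameters `train_fraction` and `seed` are assumptions made for illustration; they do not come from the slides.

```python
import random

def validation_split(n_samples, train_fraction=0.5, seed=0):
    """Return (train_indices, test_indices) from one random shuffle."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # seeded so the split is repeatable
    cut = int(n_samples * train_fraction)     # 50% goes to training by default
    return indices[:cut], indices[cut:]

train_idx, test_idx = validation_split(10)
```

The model would be fitted on the rows in `train_idx` and scored once on the rows in `test_idx`. Because there is only one split, the score can vary noticeably with the seed.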

2) Leave-One-Out Cross-Validation
(LOOCV): In this approach, a single
data point is held out as the test
set and the model is trained on the
rest of the data. The test error of
the prediction for the held-out
point is recorded. This process is
repeated for every data point, and
the recorded errors are averaged.
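
The loop can be sketched as follows. To keep the example self-contained, the "model" here is an assumed stand-in that simply predicts the mean of its training values; the function name `loocv_mse` is also hypothetical.

```python
def loocv_mse(values):
    """LOOCV mean squared error of a model that predicts the training mean."""
    squared_errors = []
    for i, held_out in enumerate(values):
        train = values[:i] + values[i + 1:]       # every point except the i-th
        prediction = sum(train) / len(train)      # stand-in for fitting a model
        squared_errors.append((held_out - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)

error = loocv_mse([1.0, 2.0, 3.0])
```

Note that the model is refitted once per data point, which is why LOOCV becomes expensive on large datasets.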

3) k-fold Cross-Validation: The cons
of the previous two approaches are
addressed in this approach. Here the
data is split into k subsets (folds).
Each fold is used once as the test
set while the model is trained on
the remaining k-1 folds, and the
average prediction error rate over
the k folds is calculated.
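
A minimal sketch of the fold construction, assuming contiguous folds with no shuffling (in practice the rows are usually shuffled first). The names `k_fold_indices` and `k_fold_mse` are hypothetical, and the mean predictor is only a stand-in for a real model.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def k_fold_mse(values, k):
    """Average the per-fold MSE of a model that predicts the training mean."""
    fold_errors = []
    for train, test in k_fold_indices(len(values), k):
        prediction = sum(values[i] for i in train) / len(train)
        fold_errors.append(sum((values[i] - prediction) ** 2 for i in test) / len(test))
    return sum(fold_errors) / k

folds = list(k_fold_indices(6, 3))
```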

4) Repeated k-fold Cross-
Validation: In this approach, the
process of splitting the data into k
folds is repeated n times, each time
with a new random shuffle, resulting
in n random partitions of the
original sample. The results are
then averaged to produce a single
estimate.
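
The repetition can be sketched as a generator that reshuffles before each round of k folds. The name `repeated_k_fold` and its parameters are assumptions for illustration.

```python
import random

def repeated_k_fold(n_samples, k, n_repeats, seed=0):
    """Yield (repeat, train_indices, test_indices) for n_repeats shuffled partitions."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)                  # a fresh random partition per repeat
        for fold in range(k):
            test = indices[fold::k]           # every k-th shuffled index forms one fold
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            yield repeat, train, test

splits = list(repeated_k_fold(n_samples=6, k=3, n_repeats=2))
```

Scoring a model on each of the `k * n_repeats` splits and averaging the scores gives the single estimate.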

5) Stratified k-fold Cross-
Validation: In this approach, the
data is rearranged so that each
fold is a good representation of the
whole dataset, for example by
keeping roughly the same class
proportions in every fold.
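
One simple way to obtain such folds is to deal the samples of each class into the folds in round-robin order. This is a sketch under that assumption, not the only possible scheme; `stratified_folds` is a hypothetical name.

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds, class by class, in round-robin order."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0                              # one running counter keeps fold sizes even
    for label in sorted(by_class):
        for index in by_class[label]:
            folds[position % k].append(index)
            position += 1
    return folds

labels = ["a", "a", "a", "a", "b", "b"]
folds = stratified_folds(labels, k=2)
```

Here each of the two folds receives two "a" samples and one "b" sample, matching the 2:1 ratio of the full list.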

6) Adversarial Validation: This
approach checks how similar the
feature distributions of the
test/validation set and the train
set are, commonly by training a
classifier to tell the two apart. If
the sets are easy to distinguish,
the data differs between them, and
validation scores may not carry
over to the test data.
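
The following is a deliberately simplified stand-in for that check. Instead of fitting a classifier on all features, it uses the raw value of a single feature as the score and computes the AUC for separating test rows from train rows. The name `auc_train_vs_test` is hypothetical.

```python
def auc_train_vs_test(train_values, test_values):
    """Probability that a random test value exceeds a random train value (ties count half)."""
    wins = 0.0
    for test_value in test_values:
        for train_value in train_values:
            if test_value > train_value:
                wins += 1.0
            elif test_value == train_value:
                wins += 0.5
    return wins / (len(train_values) * len(test_values))

similar = auc_train_vs_test([1, 2, 3, 4], [1, 2, 3, 4])
shifted = auc_train_vs_test([1, 2, 3, 4], [11, 12, 13, 14])
```

A value near 0.5 suggests the feature is distributed similarly in both sets; a value close to 0 or 1 suggests a shift.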

7) Cross-Validation for Time Series:
A time series dataset cannot be
randomly split because shuffling
destroys the time order and lets
the model train on observations
from the future. For a time series
forecasting problem, folds are
created in a forward-chaining
fashion: each fold trains on earlier
observations and tests on the ones
that follow.
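
Forward chaining can be sketched with an expanding training window. The function name `forward_chaining_splits` and the choice of equal-sized test blocks are assumptions for illustration.

```python
def forward_chaining_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) where training always precedes testing."""
    fold_size = n_samples // (n_splits + 1)
    for split in range(1, n_splits + 1):
        train_end = split * fold_size
        # The last test block absorbs any leftover samples.
        test_end = n_samples if split == n_splits else train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))

splits = list(forward_chaining_splits(n_samples=8, n_splits=3))
```

With 8 time-ordered samples and 3 splits, the training window grows from 2 to 4 to 6 samples, and each test block contains the 2 samples that come next.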

8) Custom Cross-Validation
Techniques: No single method works
best for all problem statements.
Hence a custom cross-validation
technique can be created based on
a feature or combination of features
that will give the user stable cross-
validation scores.
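
As one possible example of a feature-based scheme, the sketch below keeps all rows that share a grouping feature in the same fold, so that no group appears in both train and test. The function name `group_folds` and the `customer_ids` feature are hypothetical.

```python
def group_folds(groups, k):
    """Split sample indices into k folds so that each group stays in one fold."""
    unique_groups = sorted(set(groups))
    fold_of_group = {group: i % k for i, group in enumerate(unique_groups)}
    folds = [[] for _ in range(k)]
    for index, group in enumerate(groups):
        folds[fold_of_group[group]].append(index)
    return folds

customer_ids = ["c1", "c2", "c1", "c3", "c2", "c4"]  # hypothetical grouping feature
folds = group_folds(customer_ids, k=2)
```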

Learn about Model building in our
Business Modelling Expert
program.

+91 98113 70943

www.benchmarksixsigma.com

https://tinyurl.com/BSSinquire
