In which epoch should i stop the training to avoid overfitting
$begingroup$
I'm working on an age estimation project trying to classify a given face in a predefined age range. For that purpose I'm training a deep NN using the keras library. The accuracy for the training and the validation sets is shown in the graph below:

As you can see the validation accuracy keeps rising with smaller steps than the training accuracy. Should i stop training at the epoch 280 in which the training and the validation accuracy have the same value or should i proceed the training process as long as the validation accuracy is rising, even thought the training accuracy value is also getting at overfitted values (eg. 93%).
machine-learning neural-network deep-learning accuracy overfitting
$endgroup$
add a comment |
$begingroup$
I'm working on an age estimation project trying to classify a given face in a predefined age range. For that purpose I'm training a deep NN using the keras library. The accuracy for the training and the validation sets is shown in the graph below:

As you can see the validation accuracy keeps rising with smaller steps than the training accuracy. Should i stop training at the epoch 280 in which the training and the validation accuracy have the same value or should i proceed the training process as long as the validation accuracy is rising, even thought the training accuracy value is also getting at overfitted values (eg. 93%).
machine-learning neural-network deep-learning accuracy overfitting
$endgroup$
add a comment |
$begingroup$
I'm working on an age estimation project trying to classify a given face in a predefined age range. For that purpose I'm training a deep NN using the keras library. The accuracy for the training and the validation sets is shown in the graph below:

As you can see the validation accuracy keeps rising with smaller steps than the training accuracy. Should i stop training at the epoch 280 in which the training and the validation accuracy have the same value or should i proceed the training process as long as the validation accuracy is rising, even thought the training accuracy value is also getting at overfitted values (eg. 93%).
machine-learning neural-network deep-learning accuracy overfitting
$endgroup$
I'm working on an age estimation project trying to classify a given face in a predefined age range. For that purpose I'm training a deep NN using the keras library. The accuracy for the training and the validation sets is shown in the graph below:

As you can see the validation accuracy keeps rising with smaller steps than the training accuracy. Should i stop training at the epoch 280 in which the training and the validation accuracy have the same value or should i proceed the training process as long as the validation accuracy is rising, even thought the training accuracy value is also getting at overfitted values (eg. 93%).
machine-learning neural-network deep-learning accuracy overfitting
machine-learning neural-network deep-learning accuracy overfitting
edited Jun 5 '18 at 10:44
David Masip
2,6211529
2,6211529
asked May 29 '18 at 9:33
Yiannis AthYiannis Ath
3816
3816
add a comment |
add a comment |
5 Answers
5
active
oldest
votes
$begingroup$
As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.
Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.
In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit
$endgroup$
add a comment |
$begingroup$
"Early Stopping" is the concept which needs to be used here.
As mentioned in wikipedia about early stopping,
In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.
At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.
$endgroup$
add a comment |
$begingroup$
Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.
$endgroup$
add a comment |
$begingroup$
You should also look for training error Vs testing error than training accuracy and testing accuracy.
$endgroup$
1
$begingroup$
isn't it the same?
$endgroup$
– Francesco Pegoraro
Oct 6 '18 at 11:24
add a comment |
$begingroup$
If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.
It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.
$endgroup$
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "557"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f32306%2fin-which-epoch-should-i-stop-the-training-to-avoid-overfitting%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
5 Answers
5
active
oldest
votes
5 Answers
5
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.
Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.
In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit
$endgroup$
add a comment |
$begingroup$
As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.
Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.
In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit
$endgroup$
add a comment |
$begingroup$
As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.
Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.
In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit
$endgroup$
As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.
Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.
In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit
edited 14 hours ago
jperezmartin
31
31
answered May 29 '18 at 12:23
David MasipDavid Masip
2,6211529
2,6211529
add a comment |
add a comment |
$begingroup$
"Early Stopping" is the concept which needs to be used here.
As mentioned in wikipedia about early stopping,
In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.
At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.
$endgroup$
add a comment |
$begingroup$
"Early Stopping" is the concept which needs to be used here.
As mentioned in wikipedia about early stopping,
In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.
At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.
$endgroup$
add a comment |
$begingroup$
"Early Stopping" is the concept which needs to be used here.
As mentioned in wikipedia about early stopping,
In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.
At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.
$endgroup$
"Early Stopping" is the concept which needs to be used here.
As mentioned in wikipedia about early stopping,
In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.
At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.
answered May 30 '18 at 6:37
Divyanshu ShekharDivyanshu Shekhar
184113
184113
add a comment |
add a comment |
$begingroup$
Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.
$endgroup$
add a comment |
$begingroup$
Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.
$endgroup$
add a comment |
$begingroup$
Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.
$endgroup$
Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.
answered May 29 '18 at 12:28
pcko1pcko1
1,681418
1,681418
add a comment |
add a comment |
$begingroup$
You should also look for training error Vs testing error than training accuracy and testing accuracy.
$endgroup$
1
$begingroup$
isn't it the same?
$endgroup$
– Francesco Pegoraro
Oct 6 '18 at 11:24
add a comment |
$begingroup$
You should also look for training error Vs testing error than training accuracy and testing accuracy.
$endgroup$
1
$begingroup$
isn't it the same?
$endgroup$
– Francesco Pegoraro
Oct 6 '18 at 11:24
add a comment |
$begingroup$
You should also look for training error Vs testing error than training accuracy and testing accuracy.
$endgroup$
You should also look for training error Vs testing error than training accuracy and testing accuracy.
answered Oct 6 '18 at 11:01
Ashok Kumar JayaramanAshok Kumar Jayaraman
1212
1212
1
$begingroup$
isn't it the same?
$endgroup$
– Francesco Pegoraro
Oct 6 '18 at 11:24
add a comment |
1
$begingroup$
isn't it the same?
$endgroup$
– Francesco Pegoraro
Oct 6 '18 at 11:24
1
1
$begingroup$
isn't it the same?
$endgroup$
– Francesco Pegoraro
Oct 6 '18 at 11:24
$begingroup$
isn't it the same?
$endgroup$
– Francesco Pegoraro
Oct 6 '18 at 11:24
add a comment |
$begingroup$
If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.
It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.
$endgroup$
add a comment |
$begingroup$
If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.
It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.
$endgroup$
add a comment |
$begingroup$
If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.
It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.
$endgroup$
If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.
It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.
answered 8 hours ago
Learning is a messLearning is a mess
234211
234211
add a comment |
add a comment |
Thanks for contributing an answer to Data Science Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f32306%2fin-which-epoch-should-i-stop-the-training-to-avoid-overfitting%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown