In which epoch should i stop the training to avoid overfitting












5












$begingroup$


I'm working on an age estimation project trying to classify a given face in a predefined age range. For that purpose I'm training a deep NN using the keras library. The accuracy for the training and the validation sets is shown in the graph below:



picture



As you can see the validation accuracy keeps rising with smaller steps than the training accuracy. Should i stop training at the epoch 280 in which the training and the validation accuracy have the same value or should i proceed the training process as long as the validation accuracy is rising, even thought the training accuracy value is also getting at overfitted values (eg. 93%).










share|improve this question











$endgroup$

















    5












    $begingroup$


    I'm working on an age estimation project trying to classify a given face in a predefined age range. For that purpose I'm training a deep NN using the keras library. The accuracy for the training and the validation sets is shown in the graph below:



    picture



    As you can see the validation accuracy keeps rising with smaller steps than the training accuracy. Should i stop training at the epoch 280 in which the training and the validation accuracy have the same value or should i proceed the training process as long as the validation accuracy is rising, even thought the training accuracy value is also getting at overfitted values (eg. 93%).










    share|improve this question











    $endgroup$















      5












      5








      5


      2



      $begingroup$


      I'm working on an age estimation project trying to classify a given face in a predefined age range. For that purpose I'm training a deep NN using the keras library. The accuracy for the training and the validation sets is shown in the graph below:



      picture



      As you can see the validation accuracy keeps rising with smaller steps than the training accuracy. Should i stop training at the epoch 280 in which the training and the validation accuracy have the same value or should i proceed the training process as long as the validation accuracy is rising, even thought the training accuracy value is also getting at overfitted values (eg. 93%).










      share|improve this question











      $endgroup$




      I'm working on an age estimation project trying to classify a given face in a predefined age range. For that purpose I'm training a deep NN using the keras library. The accuracy for the training and the validation sets is shown in the graph below:



      picture



      As you can see the validation accuracy keeps rising with smaller steps than the training accuracy. Should i stop training at the epoch 280 in which the training and the validation accuracy have the same value or should i proceed the training process as long as the validation accuracy is rising, even thought the training accuracy value is also getting at overfitted values (eg. 93%).







      machine-learning neural-network deep-learning accuracy overfitting






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Jun 5 '18 at 10:44









      David Masip

      2,6211529




      2,6211529










      asked May 29 '18 at 9:33









      Yiannis AthYiannis Ath

      3816




      3816






















          5 Answers
          5






          active

          oldest

          votes


















          11












          $begingroup$

          As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.



          Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.



          In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit






          share|improve this answer











          $endgroup$





















            4












            $begingroup$

            "Early Stopping" is the concept which needs to be used here.
            As mentioned in wikipedia about early stopping,




            In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.




            At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.






            share|improve this answer









            $endgroup$





















              3












              $begingroup$

              Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.






              share|improve this answer









              $endgroup$





















                2












                $begingroup$

                You should also look for training error Vs testing error than training accuracy and testing accuracy.






                share|improve this answer









                $endgroup$









                • 1




                  $begingroup$
                  isn't it the same?
                  $endgroup$
                  – Francesco Pegoraro
                  Oct 6 '18 at 11:24



















                0












                $begingroup$

                If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.



                It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.






                share|improve this answer









                $endgroup$














                  Your Answer








                  StackExchange.ready(function() {
                  var channelOptions = {
                  tags: "".split(" "),
                  id: "557"
                  };
                  initTagRenderer("".split(" "), "".split(" "), channelOptions);

                  StackExchange.using("externalEditor", function() {
                  // Have to fire editor after snippets, if snippets enabled
                  if (StackExchange.settings.snippets.snippetsEnabled) {
                  StackExchange.using("snippets", function() {
                  createEditor();
                  });
                  }
                  else {
                  createEditor();
                  }
                  });

                  function createEditor() {
                  StackExchange.prepareEditor({
                  heartbeatType: 'answer',
                  autoActivateHeartbeat: false,
                  convertImagesToLinks: false,
                  noModals: true,
                  showLowRepImageUploadWarning: true,
                  reputationToPostImages: null,
                  bindNavPrevention: true,
                  postfix: "",
                  imageUploader: {
                  brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                  contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                  allowUrls: true
                  },
                  onDemand: true,
                  discardSelector: ".discard-answer"
                  ,immediatelyShowMarkdownHelp:true
                  });


                  }
                  });














                  draft saved

                  draft discarded


















                  StackExchange.ready(
                  function () {
                  StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f32306%2fin-which-epoch-should-i-stop-the-training-to-avoid-overfitting%23new-answer', 'question_page');
                  }
                  );

                  Post as a guest















                  Required, but never shown

























                  5 Answers
                  5






                  active

                  oldest

                  votes








                  5 Answers
                  5






                  active

                  oldest

                  votes









                  active

                  oldest

                  votes






                  active

                  oldest

                  votes









                  11












                  $begingroup$

                  As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.



                  Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.



                  In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit






                  share|improve this answer











                  $endgroup$


















                    11












                    $begingroup$

                    As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.



                    Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.



                    In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit






                    share|improve this answer











                    $endgroup$
















                      11












                      11








                      11





                      $begingroup$

                      As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.



                      Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.



                      In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit






                      share|improve this answer











                      $endgroup$



                      As long as your validation accuracy increases, you should keep training. I would stop when the test accuracy starts decreasing (this is known as early stopping). The general advise is always to keep the model that performs the best in your validation set.



                      Although it is right that your model overfits a little since epoch 280, it is not necessarily a bad thing provided that your validation accuracy is high. In general, most machine learning models will have higher training accuracy compared to validation accuracy, but this doesn't have to be bad.



                      In addition to the accuracy, Another thing to take into account is the loss in the test set. This tutorial explains the overfitting and underfitting very well https://www.tensorflow.org/tutorials/keras/overfit_and_underfit







                      share|improve this answer














                      share|improve this answer



                      share|improve this answer








                      edited 14 hours ago









                      jperezmartin

                      31




                      31










                      answered May 29 '18 at 12:23









                      David MasipDavid Masip

                      2,6211529




                      2,6211529























                          4












                          $begingroup$

                          "Early Stopping" is the concept which needs to be used here.
                          As mentioned in wikipedia about early stopping,




                          In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.




                          At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.






                          share|improve this answer









                          $endgroup$


















                            4












                            $begingroup$

                            "Early Stopping" is the concept which needs to be used here.
                            As mentioned in wikipedia about early stopping,




                            In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.




                            At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.






                            share|improve this answer









                            $endgroup$
















                              4












                              4








                              4





                              $begingroup$

                              "Early Stopping" is the concept which needs to be used here.
                              As mentioned in wikipedia about early stopping,




                              In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.




                              At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.






                              share|improve this answer









                              $endgroup$



                              "Early Stopping" is the concept which needs to be used here.
                              As mentioned in wikipedia about early stopping,




                              In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.




                              At epoch > 280 in your graph, validation accuracy becomes lesser than training accuracy and hence it becomes a case of overfitting. In order to avoid overfitting here, training further is not recommended. However you may choose to train the model beyond the epoch where training and validation accuracy matches if the resulting validation accuracy is sufficient for the particular problem you are working on.







                              share|improve this answer












                              share|improve this answer



                              share|improve this answer










                              answered May 30 '18 at 6:37









                              Divyanshu ShekharDivyanshu Shekhar

                              184113




                              184113























                                  3












                                  $begingroup$

                                  Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.






                                  share|improve this answer









                                  $endgroup$


















                                    3












                                    $begingroup$

                                    Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.






                                    share|improve this answer









                                    $endgroup$
















                                      3












                                      3








                                      3





                                      $begingroup$

                                      Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.






                                      share|improve this answer









                                      $endgroup$



                                      Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try to increase your learning rate parameter etato force the network to converge faster to the optimum weights. Be aware though, if you increase it too much though, it will become unstable.







                                      share|improve this answer












                                      share|improve this answer



                                      share|improve this answer










                                      answered May 29 '18 at 12:28









                                      pcko1pcko1

                                      1,681418




                                      1,681418























                                          2












                                          $begingroup$

                                          You should also look for training error Vs testing error than training accuracy and testing accuracy.






                                          share|improve this answer









                                          $endgroup$









                                          • 1




                                            $begingroup$
                                            isn't it the same?
                                            $endgroup$
                                            – Francesco Pegoraro
                                            Oct 6 '18 at 11:24
















                                          2












                                          $begingroup$

                                          You should also look for training error Vs testing error than training accuracy and testing accuracy.






                                          share|improve this answer









                                          $endgroup$









                                          • 1




                                            $begingroup$
                                            isn't it the same?
                                            $endgroup$
                                            – Francesco Pegoraro
                                            Oct 6 '18 at 11:24














                                          2












                                          2








                                          2





                                          $begingroup$

                                          You should also look for training error Vs testing error than training accuracy and testing accuracy.






                                          share|improve this answer









                                          $endgroup$



                                          You should also look for training error Vs testing error than training accuracy and testing accuracy.







                                          share|improve this answer












                                          share|improve this answer



                                          share|improve this answer










                                          answered Oct 6 '18 at 11:01









                                          Ashok Kumar JayaramanAshok Kumar Jayaraman

                                          1212




                                          1212








                                          • 1




                                            $begingroup$
                                            isn't it the same?
                                            $endgroup$
                                            – Francesco Pegoraro
                                            Oct 6 '18 at 11:24














                                          • 1




                                            $begingroup$
                                            isn't it the same?
                                            $endgroup$
                                            – Francesco Pegoraro
                                            Oct 6 '18 at 11:24








                                          1




                                          1




                                          $begingroup$
                                          isn't it the same?
                                          $endgroup$
                                          – Francesco Pegoraro
                                          Oct 6 '18 at 11:24




                                          $begingroup$
                                          isn't it the same?
                                          $endgroup$
                                          – Francesco Pegoraro
                                          Oct 6 '18 at 11:24











                                          0












                                          $begingroup$

                                          If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.



                                          It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.






                                          share|improve this answer









                                          $endgroup$


















                                            0












                                            $begingroup$

                                            If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.



                                            It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.






                                            share|improve this answer









                                            $endgroup$
















                                              0












                                              0








                                              0





                                              $begingroup$

                                              If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.



                                              It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.






                                              share|improve this answer









                                              $endgroup$



                                              If you're using keras or tensorflow.keras, this parameter is known as patience in the EarlyStopping callback.



                                              It equals the number of epochs with no validation accuracy improvement to trigger the end of the training phase. I usually set it to 2 or 3, 1 is usually too sensitive to noise.







                                              share|improve this answer












                                              share|improve this answer



                                              share|improve this answer










                                              answered 8 hours ago









                                              Learning is a messLearning is a mess

                                              234211




                                              234211






























                                                  draft saved

                                                  draft discarded




















































                                                  Thanks for contributing an answer to Data Science Stack Exchange!


                                                  • Please be sure to answer the question. Provide details and share your research!

                                                  But avoid



                                                  • Asking for help, clarification, or responding to other answers.

                                                  • Making statements based on opinion; back them up with references or personal experience.


                                                  Use MathJax to format equations. MathJax reference.


                                                  To learn more, see our tips on writing great answers.




                                                  draft saved


                                                  draft discarded














                                                  StackExchange.ready(
                                                  function () {
                                                  StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f32306%2fin-which-epoch-should-i-stop-the-training-to-avoid-overfitting%23new-answer', 'question_page');
                                                  }
                                                  );

                                                  Post as a guest















                                                  Required, but never shown





















































                                                  Required, but never shown














                                                  Required, but never shown












                                                  Required, but never shown







                                                  Required, but never shown

































                                                  Required, but never shown














                                                  Required, but never shown












                                                  Required, but never shown







                                                  Required, but never shown







                                                  Popular posts from this blog

                                                  Callistus I

                                                  Tabula Rosettana

                                                  How to label and detect the document text images