Validation data for multi-series stateful LSTM
With a stateful LSTM, the network state is propagated across subsequent sequences and batches. I have multiple data files that I present to the network for training (making this multi-series). My question is how to create the validation data.
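For concreteness, this is the kind of model I mean (a minimal Keras sketch; the layer sizes, batch size, window length and feature count are all placeholders):

    from tensorflow import keras

    model = keras.Sequential([
        # stateful=True carries the final LSTM state of one batch over as the
        # initial state of the next, until model.reset_states() is called.
        keras.layers.LSTM(32, stateful=True, batch_input_shape=(1, 50, 4)),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")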



At the moment I take 20% of the data from either the head or the tail of each file as the validation data, and submit the other 80% to the network.



However, because this is stateful, I wonder whether that's a good way to create my validation data. To elaborate: during training I potentially present several years or even a decade of data, so by the time the network reaches the final sequence it has state built up from all the previous years. If I then present validation data covering maybe just a week, the network has no prior state to draw on when generating the final output. Am I right in thinking this isn't a good way to validate a stateful RNN/LSTM?



What I'm thinking of instead is to set aside 20% of the data files I have, and use the entire contents of those files to compute the validation loss. Would that be better?
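Roughly this, in code (a sketch; the glob pattern, seed, and 20% fraction are placeholders for whatever fits the data):

    import glob
    import random

    # Hold out whole files for validation: each validation series is then
    # full-length, so the network can build state over it just as in training.
    files = sorted(glob.glob("data/*.csv"))   # placeholder path/pattern
    random.Random(42).shuffle(files)          # fixed seed for a stable split
    n_val = max(1, int(0.2 * len(files)))
    val_files, train_files = files[:n_val], files[n_val:]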
      neural-network lstm recurrent-neural-net
asked Jul 10 '18 at 8:09 by BigBadMe
1 Answer

Since you're using a stateful model, am I correct in assuming that you have time-series data?



If so, it would perhaps make sense (at least for your test accuracy) to always validate against the time-steps that immediately follow a training batch. That is what you would actually be doing with the model once it is trained.



Have a look at this answer, where I explain the idea in more detail. The main idea is to split the data in a rolling-forecast pattern, where each training window is scored against the validation slice that immediately follows it, e.g. something like:



    import numpy as np

    data = np.arange(1000)   # stand-in series; substitute your own data
    window_size = 100        # whatever makes sense for your data
    val_size = 15

    train_batch1 = data[0:window_size]
    val_batch1 = data[window_size:window_size + val_size]

    train_batch2 = data[1:window_size + 1]
    val_batch2 = data[window_size + 1:window_size + val_size + 1]

    # ...and so on, sliding the window forward one step each time.


That just spells out the first two windows to make the pattern as clear as possible. Here is the diagram I created for the post linked above, which visualises the same notion:



[Image: rolling window validation]



In the image, it just has train/test sets... you could of course add validation into the mix.



If this doesn't seem to make sense for your data, then temporal relationships probably don't matter much, and in that case a stateful model is perhaps not all that useful either.






answered Jul 10 '18 at 8:49 by n1k31t4
Thanks for the answer and sorry for the delayed reply. Yes, it's time-series. I understand your answer, and that's what I do now, but when testing or using the model in real life, if I only input 10% of the data then no prior state has been established in the network. I would have thought that for a stateful model you'd want to present a more substantial amount of data, so the network can build up state from which to form a prediction? But your info re: train/test windows answered another of my questions here: datascience.stackexchange.com/questions/34151 (you might want to post it as the answer) – BigBadMe, Jul 17 '18 at 20:00
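The warm-up the comment describes might look something like this in Keras (a sketch only: it assumes a stateful model built with batch_input_shape=(1, timesteps, n_features), and validate_with_warmup is an illustrative name, not from the post):

    import numpy as np

    def validate_with_warmup(model, series, timesteps, val_len):
        """Score only the tail of one file, after building state on the rest.

        Assumes `series` has shape (n_steps, n_features) and val_len >= timesteps.
        """
        model.reset_states()    # each file starts from a blank state
        warmup, tail = series[:-val_len], series[-val_len:]
        # Run the earlier part of the file through the network purely to
        # accumulate state; the outputs themselves are discarded.
        for i in range(0, len(warmup) - timesteps + 1, timesteps):
            model.predict(warmup[i:i + timesteps][np.newaxis, ...], verbose=0)
        # Now predict the validation tail using the accumulated state.
        preds = [model.predict(tail[i:i + timesteps][np.newaxis, ...], verbose=0)
                 for i in range(0, len(tail) - timesteps + 1, timesteps)]
        return np.concatenate(preds)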