Validation data for multi-series stateful LSTM
With a stateful LSTM, the network state is propagated across subsequent sequences and batches. I train the network on data from multiple files (making this multi-series). My question is how to construct the validation data.
At the moment I take 20% of the data from either the head or the tail of each file as validation data, and submit the other 80% to the network.
However, because the model is stateful, I suspect this is not a good way to build the validation set. To elaborate: during training I may present several years, even a decade, of data, so by the time the network reaches the final sequence it has state built up from all the preceding years. If I then present a validation slice of maybe a week of data, the network has no prior state from which to generate its output. Am I right in thinking this is a poor way to validate a stateful RNN/LSTM?
What I'm thinking instead is to set aside 20% of my data files and use those files in their entirety to compute the validation loss. Would that be better? A sketch of that file-level split follows.
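Concretely, the file-level split I have in mind would be something like this (just a sketch; the data_dir path and one-series-per-CSV layout are placeholders for my setup):

import random
from pathlib import Path

# One time series per file; hold out ~20% of whole files for validation.
files = sorted(Path("data_dir").glob("*.csv"))
random.seed(42)                    # reproducible split
random.shuffle(files)

n_val = max(1, len(files) // 5)
val_files = files[:n_val]          # entire series, used only for validation loss
train_files = files[n_val:]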
neural-network lstm recurrent-neural-net
asked Jul 10 '18 at 8:09 by BigBadMe
1 Answer
Since you are using a stateful model, am I correct in assuming that you have time-series data?
If so, it would make sense (at least for your validation accuracy) to always validate against the time-steps that immediately follow a training window, because that is what you would actually be doing with the model once it is trained.
Have a look at this answer, where I explain the idea in more detail. The main idea is to split the data in a rolling-forecast pattern, so that each training window is followed by the validation window it should predict, e.g.:
window_size = 100   # whatever makes sense for your data
val_size = 15

# Rolling forecast: slide the training window along the series and
# validate on the steps that immediately follow it.
n_windows = len(data) - window_size - val_size + 1
for start in range(n_windows):
    train_batch = data[start : start + window_size]
    val_batch = data[start + window_size : start + window_size + val_size]
That is just a sketch to make the idea as clear as possible. Here is the diagram I created for the post linked above, which visualises the same notion:
In the diagram there are only train/test sets; you could of course add validation into the mix.
If this doesn't seem to make sense for your data, then temporal relationships probably don't matter much, in which case a stateful model may not be very useful either.
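As an aside (not part of the original answer): scikit-learn ships a ready-made splitter for this pattern. A minimal sketch, assuming data is a NumPy array; note it uses an expanding training window rather than the fixed-size one above:

import numpy as np
from sklearn.model_selection import TimeSeriesSplit

data = np.arange(1000)          # placeholder series
tscv = TimeSeriesSplit(n_splits=5)
for train_idx, val_idx in tscv.split(data):
    train_batch = data[train_idx]
    val_batch = data[val_idx]   # always later in time than train_batch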
answered Jul 10 '18 at 8:49 by n1k31t4
Thanks for the answer, and sorry for the delayed reply. Yes, it's time-series. I understand your answer, and that's what I do now, but when testing or using the model in real life, if I only input 10% of the data, then no prior state has been established in the network. I would have thought that for a stateful model you would want to present a more significant amount of data so the network can build up state before forming a prediction? Your info re: the train/test window also answered another of my questions here: datascience.stackexchange.com/questions/34151 (you might want to post it as the answer there).
– BigBadMe Jul 17 '18 at 20:00
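To make the warm-up idea from the comment concrete: with a stateful Keras LSTM you can replay the preceding history to rebuild the hidden state before asking for a prediction. A minimal sketch; the model shape and the history/next_input arrays are illustrative, not from the thread:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Stateful LSTM: hidden state persists across predict() calls until reset.
model = Sequential([
    LSTM(32, stateful=True, batch_input_shape=(1, 1, 1)),
    Dense(1),
])

history = np.random.rand(50, 1)      # steps preceding the point of interest
next_input = np.random.rand(1)       # the step we want a prediction for

model.reset_states()
for step in history:                 # warm-up pass: build up hidden state
    model.predict(step.reshape(1, 1, 1), verbose=0)
pred = model.predict(next_input.reshape(1, 1, 1), verbose=0)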