Is there any standard or normal range for the amount of LSTM loss function?
I am working on an LSTM network and I get loss values around 4.7e-4. Adding more layers and increasing the number of epochs don't seem to help decrease it. I also use Dropout = 0.2 for each of my layers, and implemented everything with the Keras library.
I would like to know more about this loss value: is it large, or is it OK? Is there any rule of thumb for loss values? And why can't I decrease my loss any further? Is there a problem here?
lstm loss-function
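One reason there is no universal "normal" loss range: for a regression loss such as mean squared error, the value scales with the targets, so 4.7e-4 is meaningless without knowing how the data was normalized. A toy illustration in pure Python (the numbers are made up for the example):

```python
def mse(y_true, y_pred):
    # Mean squared error over paired observations
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# small-valued targets -> small absolute loss
y_true = [0.10, 0.20, 0.30]
y_pred = [0.11, 0.19, 0.33]
small = mse(y_true, y_pred)

# identical relative errors, targets scaled by 100 -> loss grows by 100**2
big = mse([t * 100 for t in y_true], [p * 100 for p in y_pred])

print(small, big)  # the second value is 10,000x the first
```

The same model with the same relative prediction errors reports a loss four orders of magnitude larger just because the targets are bigger, which is why the loss value alone cannot be judged "large" or "OK".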
0.2 is very, very small. You are almost vanishing the signal. Try something like 0.85.
– Media
yesterday
@Media: You mean I must eliminate 85% of my hidden units during each iteration?
– user145959
yesterday
You should keep them. 0.85 means you keep 85 percent of them.
– Media
yesterday
@Media: Wow! I thought the opposite!
– user145959
yesterday
Can you give more details? What are your features like? Are you normalizing? BatchNorm? etc.?
– kylec123
yesterday
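Note that the disagreement above comes down to conflicting conventions: in Keras, the argument to `keras.layers.Dropout(rate)` is the fraction of units *dropped*, whereas some older APIs (e.g. TensorFlow 1's `keep_prob`) take the probability of *keeping* a unit. A pure-Python sketch of inverted dropout as Keras applies it at training time (the unit values and seed are arbitrary):

```python
import random

def inverted_dropout(units, rate, seed=0):
    # `rate` is the fraction of units DROPPED, as in keras.layers.Dropout.
    # Survivors are scaled by 1/(1 - rate) so the expected activation
    # magnitude is the same at training and inference time.
    rng = random.Random(seed)
    keep = 1.0 - rate
    return [u / keep if rng.random() < keep else 0.0 for u in units]

activations = [1.0] * 1000
out = inverted_dropout(activations, rate=0.2, seed=42)

zeroed = sum(1 for u in out if u == 0.0)
print(zeroed)    # roughly 200 of the 1000 units (about 20%) are zeroed
print(sum(out))  # close to the original sum of 1000 in expectation
```

So under Keras semantics, Dropout(0.2) keeps about 80% of the units; passing 0.85 would drop 85% of them.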
edited 16 hours ago · asked yesterday by user145959
1 Answer
Although the loss function is an indication of how well the model is training, one usually uses other, more intuitive metrics to assess how good the model is.
If you are looking at a classification problem, your loss function is most probably cross-entropy. Regarding the loss, what matters is understanding its behaviour during training more than its absolute value. A loss that decreases during training indicates that the model is learning effectively. At some point the loss will stop decreasing, which means the model has arrived at a minimum. One also needs to understand the interplay between the loss on the training set and on the validation set, and how to detect things like overfitting; if you are not familiar with that, there is plenty of literature on the topic.
To know how good a model is, I would use other metrics that give a better indication and intuition. For example, in a classification problem one can look at precision, recall, accuracy (if the classes are not very unbalanced), or ROC AUC. In a regression problem, you may be more interested in the Mean or Median Absolute Percentage Error (MAPE or MdAPE).
answered 12 hours ago by Escachator
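The two percentage-error metrics mentioned in the answer are easy to compute by hand; a minimal sketch with made-up numbers, showing why the median variant is more robust to a single bad prediction:

```python
def mape(actual, pred):
    # Mean Absolute Percentage Error (assumes no zero actuals)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)

def mdape(actual, pred):
    # Median Absolute Percentage Error: far less sensitive to outliers
    errs = sorted(abs((a - p) / a) for a, p in zip(actual, pred))
    n = len(errs)
    mid = n // 2
    median = errs[mid] if n % 2 else (errs[mid - 1] + errs[mid]) / 2
    return 100.0 * median

actual = [100.0, 200.0, 400.0, 1000.0]
pred   = [110.0, 190.0, 420.0, 1500.0]  # last prediction is a big outlier
print(mape(actual, pred))   # 17.5 - dragged up by the one outlier
print(mdape(actual, pred))  # 7.5  - the typical relative error
```

Three of the four predictions are within 10%, yet MAPE reports 17.5% because of the single 50% miss; MdAPE stays at 7.5%, closer to the typical error.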
add a comment |
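The interplay between training and validation loss can also be checked mechanically. A heuristic sketch (the loss histories, the `patience` threshold, and the function name are all illustrative, not a standard API): flag the epoch where validation loss starts rising for several epochs in a row while training loss keeps falling, a common sign of overfitting.

```python
def overfit_start(train_loss, val_loss, patience=3):
    # Returns the first epoch of a `patience`-long streak where
    # val loss rises while train loss falls, else None.
    rises = 0
    for epoch in range(1, len(val_loss)):
        if val_loss[epoch] > val_loss[epoch - 1] and train_loss[epoch] < train_loss[epoch - 1]:
            rises += 1
            if rises == patience:
                return epoch - patience + 1  # first epoch of the streak
        else:
            rises = 0
    return None

train = [1.0, 0.6, 0.4, 0.3, 0.25, 0.2, 0.18]   # keeps falling
val   = [1.1, 0.7, 0.5, 0.55, 0.6, 0.65, 0.7]   # turns upward at epoch 3
print(overfit_start(train, val))  # 3
```

In Keras the same idea is what `EarlyStopping` monitoring `val_loss` automates; the point is that a widening train/validation gap, not the raw loss value, is the actionable signal.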