Keras, DNN ending with sigmoid - model.predict produces values < 0.5. This indicates…?
I'm trying a simple Keras project with Dense layers for binary classification. I have about 300,000 rows of data, and the labels are:

    training_set['TARGET'].value_counts()
    0    282686
    1     24825
My model looks like this:

    def build_model():
        model = models.Sequential()
        model.add(layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.001),
                               input_shape=(train_data.shape[1],)))
        model.add(layers.Dropout(0.5))
        model.add(layers.Dense(32, kernel_regularizer=regularizers.l2(0.001), activation='relu'))
        model.add(layers.Dropout(0.5))
        model.add(layers.Dense(1, activation='sigmoid'))
        model.compile(optimizer='rmsprop',
                      loss='binary_crossentropy',
                      metrics=['accuracy'])
        return model
So it's binary classification ending with a sigmoid. My understanding is that I should get values close to 0 or close to 1. I've tried different model architectures, hyperparameters, epochs, batch sizes, etc., but when I run model.predict on my validation set the values never get above 0.5. Here are some samples:
20 epochs, 16384 batch size
max 0.458850622177124, min 0.1022530049085617
max 0.47131556272506714, min 0.057787925004959106
20 epochs, 8192 batch size
max 0.42957592010498047, min 0.060324762016534805
max 0.3811708390712738, min 0.022215187549591064
20 epochs, 4096 batch size
max 0.3163970410823822, min 0.0657803937792778
20 epochs, 2048 batch size
max 0.21799422800540924, min 0.03832605481147766
Is this an indication that I'm doing something wrong?
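One detail worth calling out from the value_counts() output above: only about 24,825 of ~307,511 rows are positive, a base rate of roughly 8%. Sigmoid outputs clustered below 0.5 are consistent with a probability-calibrated model on data that imbalanced. A quick back-of-the-envelope check (plain Python, numbers taken from the counts above):

```python
# Sanity check: with heavy class imbalance, sigmoid outputs below 0.5 can be
# exactly what a well-calibrated model should produce.
n_neg, n_pos = 282686, 24825          # counts from value_counts() above
base_rate = n_pos / (n_neg + n_pos)   # prior probability of the positive class
print(f"positive base rate: {base_rate:.3f}")  # ~0.081

# A model that finds no useful signal would predict roughly this base rate
# everywhere, i.e. around 0.08 -- far below the 0.5 threshold. Informative
# features shift individual predictions up or down from the prior, but with
# only ~8% positives, relatively few samples may ever cross 0.5.
```

In other words, predictions below 0.5 don't by themselves indicate a broken model; they may simply reflect the class prior. Lowering the decision threshold or weighting the classes during training are common responses to this situation.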
[Figure: training and validation loss]
asked Sep 16 '18 at 0:58 by rr_cook, edited Sep 16 '18 at 1:43
1 Answer
I think the dropout is a bit high, and if it's binary classification, why end with a single node? If you switch to a softmax output, make sure your target variable has the proper shape (one-hot encoded, e.g. via to_categorical()).
    def build_model():
        model = models.Sequential()
        model.add(layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.001),
                               input_shape=(train_data.shape[1],)))
        model.add(layers.Dropout(0.3))
        model.add(layers.Dense(32, kernel_regularizer=regularizers.l2(0.001), activation='relu'))
        model.add(layers.Dropout(0.3))
        model.add(layers.Dense(num_classes, activation='softmax'))  # num_classes = 2 for binary
        model.compile(optimizer='rmsprop',
                      # a softmax output with one-hot targets needs categorical,
                      # not binary, cross-entropy
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])
        return model
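For the softmax variant, the targets must be reshaped to one-hot vectors as noted above. A minimal sketch using a NumPy equivalent of keras.utils.to_categorical (the label values here are made up for illustration):

```python
import numpy as np

# Hypothetical 0/1 labels, standing in for the TARGET column
y = np.array([0, 1, 0, 0, 1])

# Equivalent of keras.utils.to_categorical(y, num_classes=2):
# each label becomes a one-hot row, so the shape is (n_samples, 2)
y_onehot = np.eye(2)[y]
print(y_onehot.shape)  # (5, 2)
```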
To improve it further, you could try techniques such as cross-validation and batch normalization, and perhaps increase the number of epochs.
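As a sketch of the batch-normalization suggestion (the layer placement here is one common choice, not something specified above), BatchNormalization layers can be inserted after each hidden Dense layer:

```python
from tensorflow.keras import models, layers, regularizers

def build_model_bn(input_dim, num_classes=2):
    # Same architecture as above, with BatchNormalization after each hidden layer
    model = models.Sequential()
    model.add(layers.Dense(64, activation='relu',
                           kernel_regularizer=regularizers.l2(0.001),
                           input_shape=(input_dim,)))
    model.add(layers.BatchNormalization())
    model.add(layers.Dropout(0.3))
    model.add(layers.Dense(32, activation='relu',
                           kernel_regularizer=regularizers.l2(0.001)))
    model.add(layers.BatchNormalization())
    model.add(layers.Dropout(0.3))
    model.add(layers.Dense(num_classes, activation='softmax'))
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',  # matches the softmax output
                  metrics=['accuracy'])
    return model
```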
Short answer: I end with a single node because this is basically my first deep learning project, and Chollet's "Deep Learning with Python" ended its binary classification project with that layer. I thought binary classification was one node with sigmoid, and multiclass was n nodes with softmax? – rr_cook, Sep 16 '18 at 1:33
The quick Dropout change didn't make a difference. I added a typical training-and-validation-loss picture in the hopes that it offers a clue. I'll try the other suggestions, thanks. – rr_cook, Sep 16 '18 at 1:44
How are you passing your inputs, and what preprocessing are you doing? Add that and the compile step; it will help others debug. I don't think there's anything wrong with the model-building process, except that I prefer the softmax version because it shows me the model's behaviour per class. – Aditya, Sep 16 '18 at 1:57
Numeric columns are standardized: the mean is subtracted and the result divided by the std. Categorical columns (M/F, etc.) are one-hot encoded using pd.get_dummies. – rr_cook, Sep 16 '18 at 2:04
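That preprocessing can be sketched roughly as follows; the column names and values are hypothetical, since the original post doesn't show the actual data:

```python
import pandas as pd

# Hypothetical frame standing in for the real training data
df = pd.DataFrame({
    'AGE':    [25, 40, 31, 58],
    'INCOME': [30e3, 80e3, 52e3, 61e3],
    'SEX':    ['M', 'F', 'F', 'M'],
})

numeric_cols = ['AGE', 'INCOME']

# Standardize numeric columns: subtract the mean, divide by the std
df[numeric_cols] = (df[numeric_cols] - df[numeric_cols].mean()) / df[numeric_cols].std()

# One-hot encode categorical columns (M/F etc.) with pd.get_dummies
df = pd.get_dummies(df, columns=['SEX'])
print(df.columns.tolist())  # ['AGE', 'INCOME', 'SEX_F', 'SEX_M']
```

One caveat worth keeping in mind: the mean and std should be computed on the training split only, then reused to transform the validation and test data.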
Share a sample of your data if the full set is too big, and I'll try some experiments! – Aditya, Sep 17 '18 at 3:51
answered Sep 16 '18 at 1:18 by Aditya, edited Sep 16 '18 at 1:51