Training Accuracy stuck in Keras
$begingroup$
I have trained a CNN using Keras for image classification with 3 classes. The results are bad, and I'm trying to understand what the classifier has and has not learnt. It only ever outputs one class.
I have changed the learning rate, the activations (relu and sigmoid, with softmax for the last layer), the architecture and the optimizer (SGD and Adam), but the training accuracy stays stuck at ~33.33%. That is definitely not a coincidence, because I only have 3 classes.
My present architecture is:
[Conv -> ReLU -> Conv -> ReLU -> MaxPool] * 3 -> Flatten -> [Dense -> ReLU] * 2 -> Dense -> Softmax
My first 2 conv layers have 64 filters of size (3, 3), and the remaining conv layers have 32 filters of the same size.
My dense layers go 128 units -> 64 units -> 3 units.
I started with a simple model and made it more complex to improve it, but none of these changes has brought any improvement.
I have used two activations, 'relu' and 'sigmoid', for experimental purposes. I'm thinking of using only sigmoid, with softmax for the last layer.
I have ~13000 images to train and 1400 for validation. The distribution is almost equal among the 3 classes.
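An accuracy of about 33.33% on three roughly balanced classes is exactly what a model that always predicts the same class would score, which fits the single-class output described above. A minimal sketch of that arithmetic, using made-up labels (the 2000-per-class counts and the `constant_predictor_accuracy` helper are illustrative assumptions, not the real dataset or any Keras function):

```python
from collections import Counter

def constant_predictor_accuracy(labels, predicted_class):
    """Accuracy of a model that outputs `predicted_class` for every sample."""
    correct = sum(1 for y in labels if y == predicted_class)
    return correct / len(labels)

# Hypothetical balanced label set: 2000 samples for each of 3 classes.
labels = [0] * 2000 + [1] * 2000 + [2] * 2000

print(Counter(labels))
for c in range(3):
    # Every constant prediction scores 1/3 on balanced data.
    print(c, round(constant_predictor_accuracy(labels, c), 4))
```

Comparing the training accuracy against this baseline is a quick way to tell "not learning at all" apart from "learning slowly".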
I was using this syntax to add the activation. The summary didn't show any activation layers, and my network wasn't improving.
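from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
classifier = Sequential()
# Activation passed as an argument to each conv layer
classifier.add(Conv2D(32, (3, 3), input_shape=(256, 256, 3), activation='relu'))
classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))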
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_11 (Conv2D)           (None, 254, 254, 32)      896
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 252, 252, 32)      9248
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 84, 84, 32)        0
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 82, 82, 32)        9248
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 80, 80, 32)        9248
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 40, 40, 32)        0
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 38, 38, 32)        9248
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 36, 36, 32)        9248
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 18, 18, 32)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 10368)             0
_________________________________________________________________
dense_6 (Dense)              (None, 128)               1327232
_________________________________________________________________
dense_7 (Dense)              (None, 64)                8256
_________________________________________________________________
dense_8 (Dense)              (None, 3)                 195
=================================================================
Total params: 1,382,819
Trainable params: 1,382,819
Non-trainable params: 0
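The shapes and parameter counts in this summary can be checked by hand, which helps confirm which model was actually built. The sketch below is plain Python, not Keras; the helper names are hypothetical, and the pool sizes are an assumption inferred from the printed shapes (252 -> 84 implies a pool of 3 in the first block, even though the snippet shows pool_size (2, 2), so the summary may come from a slightly different version of the model).

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def conv_params(in_ch, out_ch, kernel=3):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_ch + 1) * out_ch

def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return (n_in + 1) * n_out

# Assumption: pool sizes inferred from the printed output shapes.
pool_sizes = [3, 2, 2]

size, channels, total = 256, 3, 0
for pool in pool_sizes:
    for _ in range(2):  # two conv layers per block, 32 filters each
        total += conv_params(channels, 32)
        size, channels = conv_out(size), 32
    size //= pool       # max pooling shrinks the spatial size

flat = size * size * channels
for n_in, n_out in [(flat, 128), (128, 64), (64, 3)]:
    total += dense_params(n_in, n_out)

print(flat, total)  # 10368 1382819
```

Both numbers match the Flatten row and the "Total params" line above. The first conv layer's 896 parameters correspond to 32 filters, not the 64 mentioned in the description.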
Edit: updated network. When I add Activation as a separate layer instead, the architecture shown in the summary changes, and now my network seems to work.
from keras.layers import Activation
classifier = Sequential()
# Activation added as its own layer after each conv layer
classifier.add(Conv2D(64, (3, 3), input_shape=(256, 256, 3)))
classifier.add(Activation('relu'))
classifier.add(Conv2D(64, (3, 3)))
classifier.add(Activation('relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_13 (Conv2D)           (None, 254, 254, 32)      896
_________________________________________________________________
activation_1 (Activation)    (None, 254, 254, 32)      0
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 252, 252, 32)      9248
_________________________________________________________________
activation_2 (Activation)    (None, 252, 252, 32)      0
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 84, 84, 32)        0
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 82, 82, 32)        9248
_________________________________________________________________
activation_3 (Activation)    (None, 82, 82, 32)        0
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 80, 80, 32)        9248
_________________________________________________________________
activation_4 (Activation)    (None, 80, 80, 32)        0
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 40, 40, 32)        0
_________________________________________________________________
flatten_2 (Flatten)          (None, 51200)             0
_________________________________________________________________
dense_7 (Dense)              (None, 128)               6553728
_________________________________________________________________
activation_5 (Activation)    (None, 128)               0
_________________________________________________________________
dense_8 (Dense)              (None, 3)                 387
_________________________________________________________________
activation_6 (Activation)    (None, 3)                 0
=================================================================
Total params: 6,582,755
Trainable params: 6,582,755
Non-trainable params: 0
Directory structure:
Training_path
-Label1 Folder
-Label2 Folder
-Label3 Folder
I think that was the problem in my network: the activation argument wasn't working as expected, and no activations were being applied in the network. What I don't understand is that both syntaxes are equivalent (according to the documentation) and yet produce different results.
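In Keras, an activation passed through the `activation` argument is applied inside the layer itself, so `summary()` prints no separate row for it; a missing Activation row is therefore not, by itself, evidence that no activation ran. The sketch below is a plain-Python stand-in for that idea, not Keras code: `dense` and `relu` are hypothetical helpers, and the weights are made-up numbers.

```python
def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases, activation=None):
    """Toy dense layer: one output per row of `weights`."""
    out = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    return activation(out) if activation is not None else out

x = [1.0, -2.0, 0.5]
W = [[0.2, 0.4, -0.6], [0.5, -0.1, 0.3]]
b = [0.1, -0.2]

fused = dense(x, W, b, activation=relu)  # like Dense(..., activation='relu')
separate = relu(dense(x, W, b))          # like Dense(...) then Activation('relu')
print(fused == separate)  # True
```

Since the two forms compute the same thing, the different results are more likely to come from the other differences between the two runs: the second summary has two conv blocks instead of three and a 128 -> 3 dense head instead of 128 -> 64 -> 3.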
machine-learning python neural-network keras
$endgroup$
$begingroup$
Do you have class imbalance in the input? For example, if there are 100 images, how many belong to each of the 3 classes? Since you've added a dense layer of size 3, the model is trained for 3 classes. Add details about the percentage of each output class in the data; that could give us some clue.
$endgroup$
– hssay
Aug 17 '18 at 9:06
$begingroup$
@hssay I'm sure there is no class imbalance. My training set has 6080 images, and each class has at least 2000 images.
$endgroup$
– Mohit Motwani
Aug 17 '18 at 9:27
$begingroup$
Any other noteworthy symptoms? Is the error going down with the number of epochs? Is the learning rate so low that the network is not learning anything? Have you tried training with higher learning rates?
$endgroup$
– hssay
Aug 17 '18 at 9:30
$begingroup$
Actually, there is definitely something weird with the model. I have added more conv layers and decreased the receptive field and the number of filters. I changed the optimizer to SGD too. What I have noticed is that the training accuracy gets stuck at 0.3334 after a few epochs or right from the beginning (depending on which optimizer or learning rate I'm using). So yes, the model is not learning beyond 33 percent accuracy. Learning rates tried: 0.01, 0.001, 0.0001.
$endgroup$
– Mohit Motwani
Aug 17 '18 at 9:34
$begingroup$
Okay, the data generator looks fine. The next thing you can do is visualize the predicted classifications; here is a link to a blog post that contains code to do that: medium.com/@kylepob61392/… Scroll down to model evaluation. Also make sure the data in the folders has been appropriately grouped into the subfolders.
$endgroup$
– Nischal Hp
Aug 17 '18 at 11:38
edited Aug 21 '18 at 8:31
Mohit Motwani
asked Aug 17 '18 at 6:06
Mohit Motwani
2 Answers
$begingroup$
I faced a similar issue. One-hot encoding the target variable with to_categorical from the Keras utilities (np_utils) solved the issue of accuracy and validation loss being stuck. Using weights to balance the target classes further improved performance.
# In older Keras releases this function also lives in keras.utils.np_utils
from keras.utils import to_categorical
# Keep copies of the original integer labels
y_train_temp = y_train.copy()
y_val_temp = y_val.copy()
# One-hot encode the integer class labels
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)
New contributor
Sonali Dasgupta is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
$endgroup$
add a comment |
$begingroup$
So there were two problems:
- When I tried to add the activations as a parameter
Conv2d( activation = 'relu'), my activation layers were not added as displayed in the summary in my question. But by usingadd(Activation('relu'))`, it worked as shown in the question. - But that didn't solve the problem. I used learning rates of
0.01, 0.007, 0.005which showed good training and validation accuracies for few epochs but plummeted to 33.33 again so probably overshooting. So I used a much lower learning rates of0.0001, which worked pretty well.
$endgroup$
add a comment |
$begingroup$
I faced a similar issue. One-hot encoding the target variable with to_categorical from the Keras utilities solved the problem of accuracy and validation loss being stuck. Using class weights to balance the target classes further improved performance.
from keras.utils import to_categorical
# Keep copies of the original integer labels
y_train_temp = y_train.copy()
y_val_temp = y_val.copy()
# One-hot encode the targets
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)
$endgroup$
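As a rough illustration of the two steps described in this answer, here is a dependency-free sketch of what one-hot encoding produces and of one common way to derive balancing weights: n_samples / (n_classes * class_count), the heuristic scikit-learn calls "balanced". The labels are made up, `one_hot` and `balanced_class_weights` are hypothetical helper names, and this is not necessarily the weighting scheme the answer used.

```python
from collections import Counter

def one_hot(labels, n_classes):
    """Turn integer labels into one-hot rows, e.g. 2 -> [0.0, 0.0, 1.0]."""
    return [[1.0 if c == label else 0.0 for c in range(n_classes)] for label in labels]

def balanced_class_weights(labels, n_classes):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumes every class occurs at least once in labels.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    return {c: n_samples / (n_classes * counts[c]) for c in range(n_classes)}

# Made-up, imbalanced integer labels for three classes.
y = [0, 0, 0, 0, 1, 1, 2, 2]

y_one_hot = one_hot(y, n_classes=3)
weights = balanced_class_weights(y, n_classes=3)

print(y_one_hot[0])
print(weights)
```

In Keras, a dictionary like `weights` can be passed to `model.fit` through its `class_weight` argument, so that the rarer classes contribute more to the loss.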
edited yesterday
Siong Thye Goh
answered yesterday
Sonali Dasgupta
$begingroup$
So there were two problems:
- When I passed the activation as a parameter, `Conv2D(activation = 'relu')`, the activation layers did not appear in the model summary shown in my question. Adding them explicitly with `add(Activation('relu'))` worked, as shown in the question.
- But that didn't solve the problem. Learning rates of 0.01, 0.007 and 0.005 gave good training and validation accuracies for a few epochs, but the accuracy then plummeted to 33.33% again, so the optimizer was probably overshooting. A much lower learning rate of 0.0001 worked pretty well.
$endgroup$
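The overshooting suspected in this answer can be reproduced on a toy problem. This sketch runs plain gradient descent on f(w) = 10 * w**2 rather than on the CNN from the question, and the step sizes are chosen for the toy function, so only the qualitative behaviour carries over.

```python
def gradient_descent(learning_rate, steps=20, w=1.0):
    """Minimise f(w) = 10 * w**2 (gradient 20 * w) and return the final |w|."""
    for _ in range(steps):
        w = w - learning_rate * 20.0 * w
    return abs(w)

# Each update multiplies w by (1 - 20 * learning_rate), so the iterates
# shrink only when that factor has magnitude below 1, i.e. learning_rate < 0.1.
too_large = gradient_descent(learning_rate=0.11)  # factor -1.2: overshoots and diverges
small = gradient_descent(learning_rate=0.01)      # factor 0.8: converges steadily

print("final |w| with lr=0.11:", too_large)
print("final |w| with lr=0.01:", small)
```

With the larger step the iterate jumps past the minimum by more than it started from and moves further away on every update; with the smaller step it settles toward zero.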
answered Aug 29 '18 at 6:31
Mohit Motwani
$begingroup$
Do you have class imbalance in the input? For example, if there are 100 images, how many belong to each of the 3 classes? Since you've added a dense layer of size 3, the model is trained for 3 classes. Add details about the percentage of each output class in the data; that can give us some clue.
$endgroup$
– hssay
Aug 17 '18 at 9:06
$begingroup$
@hssay I'm sure there is no class imbalance. My training set has 6080 images and each class has at least 2000 images.
$endgroup$
– Mohit Motwani
Aug 17 '18 at 9:27
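A quick way to settle the class-balance question is to count the labels and compare the model's accuracy with what always predicting the most frequent class would score. The label counts below are invented for illustration (they only mirror the "6080 images, at least 2000 per class" figures), and `majority_baseline` is a hypothetical helper.

```python
from collections import Counter

def majority_baseline(labels):
    """Return per-class shares and the accuracy of always predicting the largest class."""
    counts = Counter(labels)
    total = len(labels)
    shares = {cls: n / total for cls, n in counts.items()}
    return shares, max(shares.values())

# Invented label list: 6080 images split almost evenly over three classes.
labels = [0] * 2027 + [1] * 2027 + [2] * 2026

shares, baseline = majority_baseline(labels)
print(shares)
print("majority-class baseline:", round(baseline, 4))
```

A training accuracy that sits exactly at this baseline suggests the network is predicting a single class for every image rather than learning from the inputs.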
$begingroup$
Any other noteworthy symptoms? Is the error going down with the number of epochs? Is the learning rate so low that the network is not learning anything? Have you tried training with higher learning rates?
$endgroup$
– hssay
Aug 17 '18 at 9:30