Training Accuracy stuck in Keras












$begingroup$


I have trained a CNN using Keras for image classification with 3 classes. The results are bad, and I'm trying to understand what the classifier has and has not learnt. It predicts only one class for every input.



I have changed the learning rate, the activations (relu and sigmoid, with softmax for the last layer), the architecture, and the optimizer (SGD and Adam), but the training accuracy stays stuck at ~33.33%. That is unlikely to be a coincidence: with 3 roughly balanced classes, 33.33% is what you get by always predicting the same class.
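To make the 33.33% figure concrete, here is a minimal sketch showing that a model which always outputs the same class scores exactly 1/3 on three perfectly balanced classes. The label counts below are made up for illustration and are not taken from the dataset in this question.

```python
# Hypothetical, perfectly balanced labels: 100 samples for each of 3 classes.
labels = [0] * 100 + [1] * 100 + [2] * 100


def constant_class_accuracy(y_true, predicted_class):
    """Accuracy of a model that outputs `predicted_class` for every sample."""
    hits = sum(1 for y in y_true if y == predicted_class)
    return hits / len(y_true)


# Whichever class the collapsed model picks, the accuracy is the same.
for cls in range(3):
    print(cls, round(constant_class_accuracy(labels, cls), 4))
```

An accuracy that sits at this value from the first epochs onward is therefore a hint that the network has collapsed to a constant prediction rather than learned a weak classifier.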



My present architecture is



[Conv -> Relu -> Conv -> Relu -> MaxPool] * 3 -> Flatten -> [Dense -> Relu] * 2 -> Dense -> Softmax


My first 2 conv layers have 64 filters of size (3, 3) and the remaining conv layers have 32 filters of the same size.



My dense layers go like this: 128 units -> 64 units -> 3 units.



I started with a simple model and made it progressively more complex, but none of these changes improved the results.



I have experimented with two activations, 'relu' and 'sigmoid'. I'm thinking of using only sigmoid, with softmax for the last layer.



I have ~13000 images to train and 1400 for validation. The distribution is almost equal among the 3 classes.



I was using the following syntax to add the activations. The summary didn't show any activation layers, and my network wasn't improving.



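from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D

classifier = Sequential()
# First block: two 3x3 convolutions, ReLU passed through the `activation` argument
classifier.add(Conv2D(32, (3, 3), input_shape=(256, 256, 3), activation='relu'))
classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))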

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_11 (Conv2D)           (None, 254, 254, 32)      896
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 252, 252, 32)      9248
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 84, 84, 32)        0
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 82, 82, 32)        9248
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 80, 80, 32)        9248
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 40, 40, 32)        0
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 38, 38, 32)        9248
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 36, 36, 32)        9248
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 18, 18, 32)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 10368)             0
_________________________________________________________________
dense_6 (Dense)              (None, 128)               1327232
_________________________________________________________________
dense_7 (Dense)              (None, 64)                8256
_________________________________________________________________
dense_8 (Dense)              (None, 3)                 195
=================================================================
Total params: 1,382,819
Trainable params: 1,382,819
Non-trainable params: 0
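The numbers in this summary can be checked by hand. The sketch below is plain Python arithmetic, not Keras code; it assumes unpadded ('valid') stride-1 convolutions and non-overlapping pooling, which are the Keras defaults, and the helper names are made up for illustration. It reproduces the parameter counts and also shows that the 252 -> 84 pooling step in the printed summary corresponds to a pool size of 3, whereas the `pool_size = (2,2)` in the snippet would give 126, so the summary was presumably printed from a slightly different version of the model.

```python
def conv_out(size, kernel=3):
    """Spatial size after an unpadded ('valid'), stride-1 convolution."""
    return size - kernel + 1


def pool_out(size, pool):
    """Spatial size after non-overlapping max pooling (stride == pool size)."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, for every unit."""
    return (inputs + 1) * units


# Parameter counts, layer by layer, for the first summary.
total = (
    conv_params(3, 32)              # first conv on an RGB input
    + 5 * conv_params(32, 32)       # the five remaining conv layers
    + dense_params(18 * 18 * 32, 128)
    + dense_params(128, 64)
    + dense_params(64, 3)
)
print("total params:", total)

# Spatial size after the first two convolutions, then after pooling.
after_convs = conv_out(conv_out(256))
print("after two convs:", after_convs)
print("pool size 2:", pool_out(after_convs, 2))
print("pool size 3:", pool_out(after_convs, 3))
```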


Edit: I updated the network. When I add each activation as a separate layer, the architecture shown in the summary changes, and now my network seems to work.



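from keras.models import Sequential
from keras.layers import Activation, Conv2D, MaxPooling2D

classifier = Sequential()
# Same block, with ReLU added as a separate Activation layer after each convolution
classifier.add(Conv2D(64, (3, 3), input_shape=(256, 256, 3)))
classifier.add(Activation('relu'))
classifier.add(Conv2D(64, (3, 3)))
classifier.add(Activation('relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))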




Layer (type)                 Output Shape              Param #
=================================================================
conv2d_13 (Conv2D)           (None, 254, 254, 32)      896
_________________________________________________________________
activation_1 (Activation)    (None, 254, 254, 32)      0
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 252, 252, 32)      9248
_________________________________________________________________
activation_2 (Activation)    (None, 252, 252, 32)      0
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 84, 84, 32)        0
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 82, 82, 32)        9248
_________________________________________________________________
activation_3 (Activation)    (None, 82, 82, 32)        0
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 80, 80, 32)        9248
_________________________________________________________________
activation_4 (Activation)    (None, 80, 80, 32)        0
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 40, 40, 32)        0
_________________________________________________________________
flatten_2 (Flatten)          (None, 51200)             0
_________________________________________________________________
dense_7 (Dense)              (None, 128)               6553728
_________________________________________________________________
activation_5 (Activation)    (None, 128)               0
_________________________________________________________________
dense_8 (Dense)              (None, 3)                 387
_________________________________________________________________
activation_6 (Activation)    (None, 3)                 0
=================================================================
Total params: 6,582,755
Trainable params: 6,582,755
Non-trainable params: 0


Directory structure:



Training_path
-Label1 Folder
-Label2 Folder
-Label3 Folder
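With one subfolder per label, the class balance can be checked directly from the directory tree. The following is a minimal sketch using only the standard library; the function name and the set of image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed image file types; adjust to whatever the dataset actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(root):
    """Map each class subfolder name under `root` to its number of image files."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Calling `count_images_per_class("Training_path")` would return a dictionary with one entry per label folder, which makes an imbalance or an empty folder easy to spot.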


I think that was the problem in my network: the activation argument wasn't working as expected, and no activations were being applied in the network. What I don't understand is that the two syntaxes are equivalent (according to the documentation) and yet produce different results.
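For reference, in Keras the `activation` argument is applied inside the layer itself, which is why `summary()` lists no separate activation row for it; the two forms are meant to compute the same function. The sketch below is a toy pure-Python stand-in for a dense layer, not Keras code, and all names and numbers in it are made up for illustration. It only shows the equivalence of the two styles.

```python
def relu(values):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases, activation=None):
    """Toy fully connected layer: one weight row and one bias per output unit."""
    outputs = [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]
    return activation(outputs) if activation is not None else outputs


x = [1.0, 2.0]
W = [[0.5, 1.0], [-1.0, 0.25]]
b = [0.1, -0.3]

fused = dense(x, W, b, activation=relu)   # like Dense(..., activation='relu')
separate = relu(dense(x, W, b))           # like Dense(...) then Activation('relu')
print(fused == separate)
```

Note that the two summaries above also differ in depth (six conv and three dense layers versus four conv and two dense layers), so the change in behaviour may come from the architecture rather than from the activation syntax.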





















$endgroup$












  • $begingroup$
    Do you have class imbalance in the input? For example, if there are 100 images, how many belong to each of the 3 classes? Since you've added a dense layer of size 3, the model is trained for 3 classes. Please add the percentage of each output class in the data; that may give us a clue.
    $endgroup$
    – hssay
    Aug 17 '18 at 9:06










  • $begingroup$
    @hssay I'm sure there is no class imbalance. My training set has 6080 images and each class has at least 2000 images.
    $endgroup$
    – Mohit Motwani
    Aug 17 '18 at 9:27










  • $begingroup$
    Any other noteworthy symptoms? Is the error going down with the number of epochs? Is the learning rate so low that the network is not learning anything? Have you tried training with higher learning rates?
    $endgroup$
    – hssay
    Aug 17 '18 at 9:30










  • $begingroup$
    Actually, there is definitely something weird with the model. I have added more conv layers and decreased the receptive field and the number of filters. I changed the optimizer to SGD too. What I have noticed is that the training accuracy gets stuck at 0.3334 after a few epochs or right from the beginning (depending on which optimizer and learning rate I'm using). So yes, the model is not learning beyond 33 percent accuracy. Learning rates tried: 0.01, 0.001, 0.0001
    $endgroup$
    – Mohit Motwani
    Aug 17 '18 at 9:34






  • $begingroup$
    Okay, the data generator looks fine. The next thing you can do is visualize the predicted classifications; here is a link to a blog post that contains code to do that: medium.com/@kylepob61392/… Scroll down to model evaluation. Also double-check that the images in the folders have been grouped into the correct subfolders.
    $endgroup$
    – Nischal Hp
    Aug 17 '18 at 11:38
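A quick way to inspect the predicted classifications without any plotting is to count the predicted classes and build a confusion matrix. The sketch below uses only the standard library; the labels and predictions are hypothetical, and in practice the predictions would be the per-sample argmax of the model's softmax output.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Hypothetical data for a model that has collapsed onto class 1.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 1, 1, 1, 1]

print("predicted class counts:", Counter(y_pred))
for row in confusion_matrix(y_true, y_pred, 3):
    print(row)
```

A single non-zero column, as in this example, confirms that every input is being assigned to the same class.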
















machine-learning python neural-network keras






edited Aug 21 '18 at 8:31







Mohit Motwani

















asked Aug 17 '18 at 6:06









Mohit Motwani












  • $begingroup$
    Do you have class imbalance in the input? For example, if there are 100 images, how many belong to each of the 3 classes? Since you've added a dense layer with size 3, model is trained for 3 classes. Add details about % of each output class in the data, that can give us some clue.
    $endgroup$
    – hssay
    Aug 17 '18 at 9:06










  • $begingroup$
    @hssay I'm sure there is no class imbalance. I'm training set has 6080 images and each class has 2000 images at least.
    $endgroup$
    – Mohit Motwani
    Aug 17 '18 at 9:27










  • $begingroup$
    Any other noteworthy symptom? Is the error going down with number of epochs? Is learning rate so low that network is not learning anything? Have you tried training with higher learning rates?
    $endgroup$
    – hssay
    Aug 17 '18 at 9:30










  • $begingroup$
    Actually there is definitely something weird with the model. I have added more conv layers and decreased the receptive field and no. of filters. Changed the optimizer to SGD too. What I have noticed is that the training accuracy gets stucks at 0.3334 after few epochs or right from the beginning(depends on which optimizer or the learning rate I'm using). So yeah, the model is not learning behind 33 percent accuracy. Tried learning rates: 0.01, 0.001, 0.0001
    $endgroup$
    – Mohit Motwani
    Aug 17 '18 at 9:34






  • 1




    $begingroup$
    Okay, the data generator looks fine, the next thing you can do is visualize the predicted classification, here is a link to the blog that contains code to do the same : medium.com/@kylepob61392/… Scroll down to model evaluation. Also make sure once the data in the folders, if they have been appropriately grouped into the subfolders.
    $endgroup$
    – Nischal Hp
    Aug 17 '18 at 11:38


















  • $begingroup$
    Do you have class imbalance in the input? For example, if there are 100 images, how many belong to each of the 3 classes? Since you've added a dense layer with size 3, model is trained for 3 classes. Add details about % of each output class in the data, that can give us some clue.
    $endgroup$
    – hssay
    Aug 17 '18 at 9:06










  • $begingroup$
    @hssay I'm sure there is no class imbalance. I'm training set has 6080 images and each class has 2000 images at least.
    $endgroup$
    – Mohit Motwani
    Aug 17 '18 at 9:27










  • $begingroup$
    Any other noteworthy symptom? Is the error going down with number of epochs? Is learning rate so low that network is not learning anything? Have you tried training with higher learning rates?
    $endgroup$
    – hssay
    Aug 17 '18 at 9:30










  • $begingroup$
    Actually there is definitely something weird with the model. I have added more conv layers and decreased the receptive field and no. of filters. Changed the optimizer to SGD too. What I have noticed is that the training accuracy gets stucks at 0.3334 after few epochs or right from the beginning(depends on which optimizer or the learning rate I'm using). So yeah, the model is not learning behind 33 percent accuracy. Tried learning rates: 0.01, 0.001, 0.0001
    $endgroup$
    – Mohit Motwani
    Aug 17 '18 at 9:34






  • 1




    $begingroup$
    Okay, the data generator looks fine, the next thing you can do is visualize the predicted classification, here is a link to the blog that contains code to do the same : medium.com/@kylepob61392/… Scroll down to model evaluation. Also make sure once the data in the folders, if they have been appropriately grouped into the subfolders.
    $endgroup$
    – Nischal Hp
    Aug 17 '18 at 11:38
















$begingroup$
Do you have class imbalance in the input? For example, if there are 100 images, how many belong to each of the 3 classes? Since you've added a dense layer with size 3, model is trained for 3 classes. Add details about % of each output class in the data, that can give us some clue.
$endgroup$
– hssay
Aug 17 '18 at 9:06




$begingroup$
Do you have class imbalance in the input? For example, if there are 100 images, how many belong to each of the 3 classes? Since you've added a dense layer with size 3, model is trained for 3 classes. Add details about % of each output class in the data, that can give us some clue.
$endgroup$
– hssay
Aug 17 '18 at 9:06












$begingroup$
@hssay I'm sure there is no class imbalance. I'm training set has 6080 images and each class has 2000 images at least.
$endgroup$
– Mohit Motwani
Aug 17 '18 at 9:27


























2 Answers


















I faced a similar issue. One-hot encoding the target variable with np_utils in Keras solved the problem of the accuracy and validation loss being stuck. Using class weights to balance the target classes further improved performance.

# Keep copies of the original integer labels
y_train_temp = y_train.copy()
y_val_temp = y_val.copy()
# In recent Keras versions: from keras.utils import to_categorical
from keras.utils.np_utils import to_categorical
# Convert the integer class labels to one-hot vectors
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)






– Sonali Dasgupta
answered yesterday (edited yesterday by Siong Thye Goh)
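
To make the two ideas in this answer concrete, here is a small NumPy-only sketch of what one-hot encoding produces and how balancing class weights can be computed. The helper names `one_hot` and `balanced_class_weights` and the toy labels are hypothetical, and the weighting formula (n_samples / (n_classes * count)) is one common heuristic, not necessarily what the answer's author used. One-hot targets are what Keras' `categorical_crossentropy` loss expects; with plain integer labels, `sparse_categorical_crossentropy` is the matching loss.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels of shape (n,) into one-hot rows of shape (n, num_classes)."""
    return np.eye(num_classes, dtype="float32")[labels]

def balanced_class_weights(labels, num_classes):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    counts = np.bincount(labels, minlength=num_classes)
    return {c: len(labels) / (num_classes * counts[c]) for c in range(num_classes)}

# Hypothetical toy labels: class 0 appears twice as often as classes 1 and 2.
y = np.array([0, 0, 1, 2])

y_one_hot = one_hot(y, num_classes=3)
weights = balanced_class_weights(y, num_classes=3)

print(y_one_hot)  # one row per sample, a single 1 in the column of its class
print(weights)    # the over-represented class 0 gets the smallest weight
```

A dictionary of this shape could be passed as the `class_weight` argument of `model.fit`, so that errors on rarer classes count more in the loss.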

So there were two problems:

1. When I passed the activation as a parameter, Conv2D(activation='relu'), my activation layers were not added, judging by the summary in my question. Adding them explicitly with add(Activation('relu')) worked, as shown in the question.

2. But that didn't solve the problem. Learning rates of 0.01, 0.007 and 0.005 gave good training and validation accuracies for a few epochs, but the accuracy then plummeted to 33.33% again, so the optimizer was probably overshooting. A much lower learning rate of 0.0001 worked pretty well.







– Mohit Motwani
answered Aug 29 '18 at 6:31
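
The overshooting suspected in point 2 can be illustrated with a toy example. The sketch below runs plain gradient descent on a one-dimensional quadratic rather than on the CNN from the question; the function `gradient_descent`, the curvature value and the two step sizes are hypothetical choices made only for illustration. With too large a step the iterate moves away from the minimum, while a smaller step converges towards it.

```python
def gradient_descent(learning_rate, steps=50, curvature=250.0, start=1.0):
    """Minimise f(w) = 0.5 * curvature * w**2 with plain gradient descent."""
    w = start
    for _ in range(steps):
        grad = curvature * w          # derivative of f at w
        w = w - learning_rate * grad  # standard update step
    return w

# Each update multiplies w by (1 - learning_rate * curvature), so the iterate
# diverges once learning_rate exceeds 2 / curvature (0.008 for this toy problem).
for lr in (0.01, 0.0001):
    print(lr, abs(gradient_descent(lr)))
```

The threshold here depends entirely on the assumed curvature; for a real network the usable range has to be found empirically, for instance by trying several learning rates on the optimizer passed to `model.compile`, as the answer describes.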
