Training Accuracy stuck in Keras

I have trained a CNN using Keras for image classification with 3 classes. The results are bad, and I'm trying to understand what the classifier has and has not learnt. It only ever predicts one class.

I have changed the learning rate, the activations (relu and sigmoid, with softmax for the last layer), the architecture and the optimizer (SGD and Adam), but the training accuracy is stuck at ~33.33%. That is definitely not a coincidence, because I only have 3 classes.
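To make the 33.33% figure concrete, here is a minimal sketch in plain Python (no Keras) of what accuracy and a confusion matrix look like when a classifier collapses to a single class on balanced 3-class data. The labels below are hypothetical, and `accuracy` and `confusion_matrix` are illustrative helpers, not library functions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Hypothetical balanced labels: 100 samples for each of 3 classes.
y_true = [0] * 100 + [1] * 100 + [2] * 100
# A collapsed classifier that always predicts class 0.
y_pred = [0] * len(y_true)

acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, num_classes=3)

print(round(acc, 4))
for row in cm:
    print(row)
```

If the real model's confusion matrix has all of its counts in one column like this, the accuracy of about one third follows directly from the class balance.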

My present architecture is

[Conv -> ReLU -> Conv -> ReLU -> MaxPool] * 3 -> Flatten -> [Dense -> ReLU] * 2 -> Dense -> Softmax

My first 2 conv layers have 64 filters of size (3, 3) and the remaining conv layers have 32 filters of the same size.

My Dense layers go like this: 128 units -> 64 units -> 3 units.

I started with a simple model and made it more complex to improve it. But there has been no improvement after any of these changes.

I have used two activations, 'relu' and 'sigmoid', for experimental purposes. I'm thinking of using only sigmoid, with softmax for the last layer.

I have ~13000 images to train and 1400 for validation. The distribution is almost equal among the 3 classes.

I was using this syntax to add the activation. The summary didn't show any activation layers, and my network wasn't improving.

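from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D

classifier = Sequential()
# Activation passed as an argument to each convolutional layer
classifier.add(Conv2D(32, (3, 3), input_shape=(256, 256, 3), activation='relu'))
classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))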



_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_11 (Conv2D)           (None, 254, 254, 32)      896
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 252, 252, 32)      9248
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 84, 84, 32)        0
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 82, 82, 32)        9248
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 80, 80, 32)        9248
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 40, 40, 32)        0
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 38, 38, 32)        9248
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 36, 36, 32)        9248
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 18, 18, 32)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 10368)             0
_________________________________________________________________
dense_6 (Dense)              (None, 128)               1327232
_________________________________________________________________
dense_7 (Dense)              (None, 64)                8256
_________________________________________________________________
dense_8 (Dense)              (None, 3)                 195
=================================================================
Total params: 1,382,819
Trainable params: 1,382,819
Non-trainable params: 0
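The shapes and parameter counts in this summary can be reproduced by hand, which is a quick way to check that the model being summarised is the one in the code. The sketch below is plain Python and assumes unpadded ('valid'), stride-1 convolutions with 3x3 kernels and non-overlapping pooling. The pool sizes are inferred from the printed shapes, not taken from the snippet: the 252 -> 84 step is consistent with a pool size of 3 in the first pooling layer, and the later halvings with a pool size of 2.

```python
def conv_out(size, kernel):
    """Spatial size after a 'valid' (unpadded), stride-1 convolution."""
    return size - kernel + 1


def pool_out(size, pool):
    """Spatial size after non-overlapping pooling (stride equals pool size)."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, for every unit."""
    return (inputs + 1) * units


size, channels, total = 256, 3, 0

# (kind, argument): filters for conv layers, pool size for pooling layers.
# Pool sizes are an inference from the summary's output shapes.
layers = [("conv", 32), ("conv", 32), ("pool", 3),
          ("conv", 32), ("conv", 32), ("pool", 2),
          ("conv", 32), ("conv", 32), ("pool", 2)]

for kind, arg in layers:
    if kind == "conv":
        total += conv_params(3, channels, arg)
        size, channels = conv_out(size, 3), arg
    else:
        size = pool_out(size, arg)

flat = size * size * channels  # output of Flatten

inputs = flat
for units in (128, 64, 3):
    total += dense_params(inputs, units)
    inputs = units

print(size, flat, total)
```

Under these assumptions the totals agree with the summary above, including the 1,382,819 trainable parameters.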

Edit: I updated the network. When I add Activation as a separate layer, the architecture shown in the summary changes, and now my network seems to work.

classifier = Sequential()
# Activation added as a separate layer after each convolution
classifier.add(Conv2D(64, (3, 3), input_shape=(256, 256, 3)))
classifier.add(Activation('relu'))
classifier.add(Conv2D(64, (3, 3)))
classifier.add(Activation('relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))

Layer (type)                 Output Shape              Param #
=================================================================
conv2d_13 (Conv2D)           (None, 254, 254, 32)      896
_________________________________________________________________
activation_1 (Activation)    (None, 254, 254, 32)      0
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 252, 252, 32)      9248
_________________________________________________________________
activation_2 (Activation)    (None, 252, 252, 32)      0
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 84, 84, 32)        0
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 82, 82, 32)        9248
_________________________________________________________________
activation_3 (Activation)    (None, 82, 82, 32)        0
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 80, 80, 32)        9248
_________________________________________________________________
activation_4 (Activation)    (None, 80, 80, 32)        0
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 40, 40, 32)        0
_________________________________________________________________
flatten_2 (Flatten)          (None, 51200)             0
_________________________________________________________________
dense_7 (Dense)              (None, 128)               6553728
_________________________________________________________________
activation_5 (Activation)    (None, 128)               0
_________________________________________________________________
dense_8 (Dense)              (None, 3)                 387
_________________________________________________________________
activation_6 (Activation)    (None, 3)                 0
=================================================================
Total params: 6,582,755
Trainable params: 6,582,755
Non-trainable params: 0

Directory structure:

Training_path
    -Label1 Folder
    -Label2 Folder
    -Label3 Folder

I think that was the problem in my network: the activation argument wasn't working as expected, so no activations were being applied in the network. What I don't understand is that both syntaxes are equivalent (according to the documentation) and yet they produce different results.
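For reference, an activation passed as a layer argument is applied inside that layer, so it does not get its own row in the summary even though it is still computed. The sketch below is a plain-Python stand-in, not Keras itself: `FusedLayer`, the weights and the input are all hypothetical, and it only illustrates that applying ReLU inside a layer and applying it as a separate step give the same numbers.

```python
def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


def linear(inputs, weights, bias):
    """Pre-activation output: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


class FusedLayer:
    """Stand-in for a layer built with activation='relu': activation lives inside."""

    def __init__(self, weights, bias, activation=None):
        self.weights = weights
        self.bias = bias
        self.activation = activation

    def __call__(self, inputs):
        out = linear(inputs, self.weights, self.bias)
        return self.activation(out) if self.activation else out


# Hypothetical weights and input, chosen only for illustration.
weights = [[0.5, 1.0], [-2.0, 0.25]]
bias = [0.1, -0.3]
x = [1.0, 2.0]

# Activation as an argument: one "layer", activation applied inside it.
fused = FusedLayer(weights, bias, activation=relu)(x)
# Activation as a separate step: a linear layer followed by ReLU.
separate = relu(FusedLayer(weights, bias)(x))

print(fused == separate)
```

If the two Keras models really behave differently, this suggests looking for another difference between the two runs (filter counts, pool sizes, learning rate) before blaming the activation argument.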

edited Aug 21 '18 at 8:31

asked Aug 17 '18 at 6:06

Mohit Motwani

294215

Do you have class imbalance in the input? For example, if there are 100 images, how many belong to each of the 3 classes? Since you've added a dense layer of size 3, the model is trained for 3 classes. Add details about the percentage of each output class in the data; that can give us some clue.
– hssay
Aug 17 '18 at 9:06

@hssay I'm sure there is no class imbalance. My training set has 6080 images and each class has at least 2000 images.
– Mohit Motwani
Aug 17 '18 at 9:27

Any other noteworthy symptom? Is the error going down with the number of epochs? Is the learning rate so low that the network is not learning anything? Have you tried training with higher learning rates?
– hssay
Aug 17 '18 at 9:30

Actually there is definitely something weird with the model. I have added more conv layers and decreased the receptive field and the number of filters. I changed the optimizer to SGD too. What I have noticed is that the training accuracy gets stuck at 0.3334 after a few epochs or right from the beginning (depending on which optimizer or learning rate I'm using). So yeah, the model is not learning beyond 33 percent accuracy. Tried learning rates: 0.01, 0.001, 0.0001
– Mohit Motwani
Aug 17 '18 at 9:34

Okay, the data generator looks fine. The next thing you can do is visualize the predicted classification; here is a link to a blog that contains code to do the same: medium.com/@kylepob61392/… Scroll down to model evaluation. Also make sure the images in the folders have been appropriately grouped into the subfolders.
– Nischal Hp
Aug 17 '18 at 11:38

machine-learning python neural-network keras


2 Answers

I faced a similar issue. One-hot encoding the target variable with Keras's to_categorical utility solved the problem of the accuracy and validation loss being stuck. Using class weights to balance the target classes further improved performance.

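# to_categorical is exposed directly from keras.utils
# (older Keras versions also provide it in keras.utils.np_utils).
from keras.utils import to_categorical

# Keep copies of the original integer labels.
y_train_temp = y_train.copy()
y_val_temp = y_val.copy()

# One-hot encode the integer class labels.
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)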
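The answer does not show how the class weights were computed. One common choice, given here only as a sketch and not necessarily what was used, is the "balanced" heuristic `n_samples / (n_classes * count)`. The function name and `example_labels` are hypothetical; the labels stand in for the integer labels saved in `y_train_temp`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class_index: weight}, weight = n_samples / (n_classes * count)."""
    counts = Counter(int(label) for label in labels)
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in sorted(counts.items())
    }


# Hypothetical integer labels standing in for y_train_temp.
example_labels = [0, 0, 0, 0, 1, 1, 2, 2]
class_weight = balanced_class_weights(example_labels)
print(class_weight)
```

A dictionary of this shape, mapping class indices to weights, is what the `class_weight` argument of Keras's `fit` accepts. Rarer classes get larger weights, so their errors contribute more to the loss.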

– Sonali Dasgupta, answered yesterday (edited yesterday by Siong Thye Goh)

So there were two problems:

1. When I passed the activation as a parameter, Conv2D(activation = 'relu'), no activation layers appeared in the model summary, as shown in my question. After switching to add(Activation('relu')), the activation layers showed up in the summary.

2. That alone didn't solve the problem, though. Learning rates of 0.01, 0.007 and 0.005 gave good training and validation accuracies for a few epochs, but the accuracy then plummeted to 33.33% again, so the optimizer was probably overshooting. A much lower learning rate of 0.0001 worked pretty well.
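To illustrate what "overshooting" means here, the toy sketch below runs plain gradient descent on a one-dimensional quadratic. It is not a model of the CNN above: the `curvature` value is an arbitrary assumption chosen so that a step size of 0.01 is too large, and the function name is hypothetical.

```python
def gradient_descent(learning_rate, curvature=250.0, start=1.0, steps=200):
    """Minimise f(w) = 0.5 * curvature * w**2 and return the final w."""
    w = start
    for _ in range(steps):
        gradient = curvature * w       # derivative of f at w
        w -= learning_rate * gradient  # plain gradient-descent update
    return w


# Each update multiplies w by (1 - learning_rate * curvature).
w_large = gradient_descent(learning_rate=0.01)    # factor -1.5: |w| grows
w_small = gradient_descent(learning_rate=0.0001)  # factor 0.975: w shrinks

print(abs(w_large), abs(w_small))
```

With the larger step every update jumps past the minimum and lands further away than before, while the smaller step moves steadily towards it. In a real network the loss surface is far more complicated, but the same mechanism is one plausible reason for accuracy collapsing after a few good epochs.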

– Mohit Motwani, answered Aug 29 '18 at 6:31
