
I am trying to classify images and assign each a label of 1 or 0 (skin cancer or not).

I am aware of the three main issues that can cause the same output for every input.

I did not split the set; I'm just applying the CNN to the training set. I know it doesn't make sense, but it's just to verify how it's working. (Predicting on the unlabeled data gives the exact same probability.)

I have verified the three main points:

  1. Scaling the data (both image size and pixel intensity values), roughly as in the sketch below.
  2. Using a low learning rate.
  3. Training only for a few epochs (6 at most) because of the computation time. Is it worth letting it run for a day just to see the results with more epochs?
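
For context, the resizing/scaling step is along these lines (a rough sketch, not the exact code; the 128×128 target size and the use of OpenCV are just examples):

import numpy as np
import cv2  # assuming OpenCV for resizing

def preprocess(image):
    # Resize every image to a common shape and scale pixel intensities to [0, 1]
    image = cv2.resize(image, (128, 128))
    return image.astype(np.float32) / 255.0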

Anyway, I can't understand how bad training could lead the network to give the same class probability every time.

I tried the on-batch options, etc.; it doesn't change anything.

Accuracy is very low, since this kind of classification is not really well suited to CNNs, but that shouldn't explain the weird result.

Here is my output: [screenshot of the output of predict]

Here are different parts of the program. Model: [screenshot of the CNN definition]

Resizing: [screenshot of the resizing code]

Thanks for the help, and sorry for the ugly screenshots.

Florian Laborde

3 Answers


When all the predictions give exactly the same value, you know your model is not learning, so something is wrong!

In your case, the problem is that the last dense layer uses a softmax AND is followed by a sigmoid activation:

model.add(keras.layers.Dense(1, activation=tf.nn.softmax))
model.add(keras.layers.Activation('sigmoid'))

This creates a conflict: the softmax always outputs 1 (since there is only one node), and the sigmoid then takes this 1 and computes:

1/(1+exp(-1)) = 0.731058

And there is our friend!
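
You can check this with a couple of lines of NumPy (a minimal sketch; the logit value is arbitrary):

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

logit = np.array([2.7])           # any single logit value
print(softmax(logit))             # [1.] -- softmax over a single node is always 1
print(sigmoid(softmax(logit)))    # [0.73105858] -- the constant prediction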

To solve this, you just need to remove the last activation layer and change the softmax to a sigmoid, since your output is binary:

import tensorflow as tf
from tensorflow import keras

model = keras.Sequential()
model.add(keras.layers.Conv2D(16, [3,3], activation='relu', padding='same'))
model.add(keras.layers.Conv2D(32, [3,3], activation='relu', padding='same'))
model.add(keras.layers.Conv2D(64, [3,3], activation='relu', padding='same'))

model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dropout(0.15))
model.add(keras.layers.Activation('relu'))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(50))
# Single sigmoid output for the binary label; no extra Activation layer afterwards
model.add(keras.layers.Dense(1, activation=tf.nn.sigmoid))
#model.add(keras.layers.Activation('sigmoid'))

This should work!
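
As a usage note, with a single sigmoid output the model would be compiled with binary cross-entropy; a minimal sketch (the optimizer, learning rate, batch size and the x_train/y_train names are only placeholders for your scaled images and 0/1 labels):

model.compile(optimizer=keras.optimizers.Adam(1e-4),
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=32, epochs=6, validation_split=0.1)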

TitoOrt

It could also be that the problem is simply very hard to learn. I've had this happen: after 6 hours of identical outputs in every batch (which happens because the 'average' answer is the easiest way to minimize the loss), the network finally started learning:

[training curve screenshot: the network eventually starts learning]

Some things I plan on doing to make the learning happen earlier:

  1. changes in the learning rate (sketched below)
  2. using features from earlier in the network
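
For the first point, a sketch of what I mean by changing the learning rate in Keras (the values and the ReduceLROnPlateau callback are illustrative; model, x_train and y_train stand for the existing setup):

from tensorflow import keras

# Start with a smaller learning rate and shrink it further when the loss plateaus
optimizer = keras.optimizers.Adam(1e-4)
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.5,
                                              patience=3, min_lr=1e-6)

model.compile(optimizer=optimizer, loss='binary_crossentropy')
model.fit(x_train, y_train, epochs=50, callbacks=[reduce_lr])
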
DankMasterDan

I faced a similar problem recently and wanted to share other methods that worked for me:

i) Using LeakyReLU instead of ReLU (see the sketch below).

ii) Using a smaller learning rate.

iii) Transformation: transform the continuous input variables using log, exp or polynomial transformations.
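
For the first two points, a minimal Keras sketch (the layer sizes, the 0.01 slope and the 1e-4 learning rate are only examples):

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Conv2D(16, (3, 3), padding='same'),
    keras.layers.LeakyReLU(alpha=0.01),   # LeakyReLU instead of a plain ReLU
    keras.layers.Flatten(),
    keras.layers.Dense(1, activation='sigmoid'),
])
# Smaller learning rate than the Keras default
model.compile(optimizer=keras.optimizers.Adam(1e-4), loss='binary_crossentropy')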

Hope it helps.