I am working on a binary classification problem for seizure classification. The full dataset has the shapes dataset_X = (154182, 32, 9, 19) and dataset_y = (154182, 1), and I split it into training, validation and test sets.

The unique values of dataset_y and their counts are array([0, 1]), array([77127, 77055]), so the classes are balanced. After the split, the training, validation and test sets contain 92508, 30837 and 30837 samples respectively.
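For reference, the split sizes above are consistent with a shuffled 60/20/20 split. The sketch below is only an illustration under that assumption: `split_indices` is a hypothetical helper, not the code I actually used, and it relies on NumPy alone.

```python
import math
import numpy as np

def split_indices(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Hypothetical helper: shuffle indices and cut them into train/val/test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    # Round the held-out parts up, the remainder goes to training.
    n_test = math.ceil(n_samples * test_frac)
    n_val = math.ceil(n_samples * val_frac)
    n_train = n_samples - n_val - n_test
    train_idx = order[:n_train]
    val_idx = order[n_train:n_train + n_val]
    test_idx = order[n_train + n_val:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = split_indices(154182)
print(len(train_idx), len(val_idx), len(test_idx))
```

The index arrays can then be used to slice dataset_X and dataset_y consistently, for example `dataset_X[train_idx]` and `dataset_y[train_idx]`.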

The configuration that uses categorical_crossentropy with a final Dense layer of size 2 and a softmax activation works very well. However, when I use binary_crossentropy with a final Dense layer of size 1 and a sigmoid activation, the training and validation phases report almost the same results, but the predictions on the test dataset are completely wrong.
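As background for why I expected both setups to behave alike: for two classes, the softmax probability of class 1 equals a sigmoid applied to the difference of the two logits. The snippet below is a small NumPy check of that identity on made-up logits. It does not use my model.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 2))  # made-up two-class logits, shape (5, 2)

p_softmax = softmax(logits)[:, 1]                  # P(class 1) from the softmax head
p_sigmoid = sigmoid(logits[:, 1] - logits[:, 0])   # sigmoid of the logit difference

print(np.allclose(p_softmax, p_sigmoid))
```

So the two heads can represent the same probabilities, which suggests the difference I am seeing comes from how the outputs are post-processed rather than from the loss itself.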

For the softmax model:

The Model:

    def create_cnn_model(X_train_shape, nb_classes):
        inputs = Input(shape=X_train_shape[1:])

        normal1 = BatchNormalization(axis=-1)(inputs)
        reshape1 = Lambda(lambda x: keras.backend.expand_dims(x, axis=-1))(normal1)
        conv1 = Convolution3D(
            32, (3, 3, X_train_shape[-1]), data_format='channels_last',
            padding='valid', strides=(1, 1, 1))(reshape1)

        reshape2 = Lambda(lambda x: keras.backend.squeeze(x, axis=-2))(conv1)

        relu1 = Activation('relu')(reshape2)
        pool1 = MaxPooling2D(pool_size=(2, 1), data_format='channels_last')(relu1)

        normal2 = BatchNormalization(axis=-1)(pool1)

        conv2 = Convolution2D(
            64, (3, 3), data_format='channels_last',
            padding='valid', strides=(1, 1))(normal2)
        relu2 = Activation('relu')(conv2)
        pool2 = MaxPooling2D(pool_size=(2, 1), data_format='channels_last')(relu2)

        normal3 = BatchNormalization(axis=-1)(pool2)

        conv3 = Convolution2D(
            64, (3, 3), data_format='channels_last',
            padding='valid', strides=(1, 1))(normal3)
        relu3 = Activation('relu')(conv3)

        flat = Flatten()(relu3)
        drop1 = Dropout(0.5)(flat)
        dens1 = Dense(256, activation='relu')(drop1)
        drop2 = Dropout(0.5)(dens1)
        dens2 = Dense(nb_classes)(drop2)

        last = Activation('softmax')(dens2)

        model = Model(inputs=inputs, outputs=last)
        return model

The code that creates the model and starts the training:

    cnn_model = create_cnn_model(X_train.shape, nb_classes)
    adam = Adam(lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
    cnn_model.compile(loss='categorical_crossentropy',
                      optimizer=adam,
                      metrics=['accuracy', 'Recall', 'Precision', 'AUC'])
    # One-hot encode the labels for the two-unit softmax output
    Y_train = Y_train.astype('uint8')
    Y_train = np_utils.to_categorical(Y_train, nb_classes)
    Y_val = np_utils.to_categorical(Y_val, nb_classes)
    cnn_model.fit(X_train, Y_train, batch_size=32, epochs=10, validation_data=(X_val, Y_val))

    predictions = cnn_model.predict(X_test, verbose=1)
    y_pred = np_utils.to_categorical(np.argmax(predictions, axis=1), nb_classes)
    y_true = np_utils.to_categorical(Y_test, nb_classes)

    #Converting categorical to numerical
    y_pred_s = y_pred.argmax(1)
    y_true_s = y_true.argmax(1)

    print(np.unique(y_pred_s, return_counts=True))
    print(np.unique(y_true_s, return_counts=True))

    print(y_pred.shape, y_true.shape)
    from sklearn.metrics import f1_score, accuracy_score, recall_score, precision_score, roc_auc_score
    acc_scr = accuracy_score(y_true, y_pred)
    pre_scr = precision_score(y_true, y_pred, average='micro')
    rec_scr = recall_score(y_true, y_pred, average='micro')
    roc_auc_score = roc_auc_score(y_true, y_pred, average='micro')

    f1_test = f1_score(y_true, y_pred, average='weighted')

The training details and testing results after 10 epochs:

Shape: x_train, y_train, X_val, y_val
(92508, 32, 9, 19) (92508, 2) (92508, 32, 9, 19) (30837, 2)
Epoch 1/10
2891/2891 [==============================] - 63s 19ms/step - loss: 0.8718 - accuracy: 0.8860 - recall: 0.8860 - precision: 0.8860 - auc: 0.9474 - val_loss: 0.1635 - val_accuracy: 0.9414 - val_recall: 0.9414 - val_precision: 0.9414 - val_auc: 0.9824
Epoch 2/10
2891/2891 [==============================] - 53s 18ms/step - loss: 0.3728 - accuracy: 0.9361 - recall: 0.9361 - precision: 0.9361 - auc: 0.9813 - val_loss: 0.1891 - val_accuracy: 0.9251 - val_recall: 0.9251 - val_precision: 0.9251 - val_auc: 0.9791
...
Epoch 10/10
2891/2891 [==============================] - 48s 17ms/step - loss: 0.1377 - accuracy: 0.9774 - recall: 0.9774 - precision: 0.9774 - auc: 0.9967 - val_loss: 0.0354 - val_accuracy: 0.9864 - val_recall: 0.9864 - val_precision: 0.9864 - val_auc: 0.9986
964/964 [==============================] - 3s 3ms/step
Shape: X_test, y_test, y_pred
(30837, 32, 9, 19) (30837, 2) (30837, 2)
Accuracy:  0.9854719979245712
Recall:  0.9854719979245712
Precision:  0.9854719979245712
ROC AUC:  0.9854719979245712
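A side note on why accuracy, recall and precision are identical above: with one-hot encoded labels and micro-averaging, every wrong sample contributes exactly one false positive and one false negative, so micro precision and micro recall both reduce to plain accuracy. The sketch below shows this on made-up labels using NumPy only. The same effect is probably behind the identical values in the Keras training log, although I have not verified that.

```python
import numpy as np

# Made-up integer labels, only for illustration.
y_true_labels = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred_labels = np.array([0, 1, 1, 1, 1, 0, 1, 0])

# One-hot encode, as to_categorical does for two classes.
y_true_oh = np.eye(2, dtype=int)[y_true_labels]
y_pred_oh = np.eye(2, dtype=int)[y_pred_labels]

# Micro-averaging pools the counts over both one-hot columns.
tp = np.sum((y_true_oh == 1) & (y_pred_oh == 1))
fp = np.sum((y_true_oh == 0) & (y_pred_oh == 1))
fn = np.sum((y_true_oh == 1) & (y_pred_oh == 0))
micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
accuracy = np.mean(y_true_labels == y_pred_labels)

# Precision and recall of the positive class alone, from the 1-D labels.
tp1 = np.sum((y_true_labels == 1) & (y_pred_labels == 1))
precision_1 = tp1 / np.sum(y_pred_labels == 1)
recall_1 = tp1 / np.sum(y_true_labels == 1)

print(micro_precision, micro_recall, accuracy)
print(precision_1, recall_1)
```

Computing the metrics on the 1-D labels (y_true_s and y_pred_s in my code) would therefore give more informative per-class numbers than the micro-averaged one-hot version.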

For the sigmoid model:

The Model: It is the same model as above but with the following changes:

    dens2 = Dense(1)(drop2)
    last = Activation('sigmoid')(dens2)

The code that creates the model and starts the training:

    cnn_model = create_cnn_model(X_train.shape, nb_classes)  # nb_classes is unused here
    adam = Adam(lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
    cnn_model.compile(loss='binary_crossentropy',
                      optimizer=adam,
                      metrics=['accuracy', 'Recall', 'Precision', 'AUC'])
    cnn_model.fit(X_train, Y_train, batch_size=32, epochs=10, validation_data=(X_val, Y_val))
    predictions = cnn_model.predict(X_test, verbose=1)
    y_pred = np.argmax(predictions)
    y_true = Y_test

    print(y_pred.shape, y_true.shape)
    from sklearn.metrics import f1_score, accuracy_score, recall_score, precision_score, roc_auc_score
    acc_scr = accuracy_score(y_true, y_pred)
    pre_scr = precision_score(y_true, y_pred)
    rec_scr = recall_score(y_true, y_pred)
    roc_auc_score = roc_auc_score(y_true, y_pred)
    f1_test = f1_score(y_true, y_pred, average='weighted')

The training details and testing results after 10 epochs:

Shape: x_train, y_train, X_val, y_val
(92508, 32, 9, 19) (92508, 1) (30837, 32, 9, 19) (30837, 1)
Epoch 1/10
2891/2891 [==============================] - 80s 24ms/step - loss: 0.0284 - accuracy: 0.9920 - recall: 0.2655 - precision: 0.5381 - auc: 0.9277 -  val_loss: 0.0156 - val_accuracy: 0.9955 - val_recall: 0.5370 - val_precision: 0.8734 - val_auc: 0.9432 
Epoch 2/10
2891/2891 [==============================] - 60s 21ms/step - loss: 0.0129 - accuracy: 0.9959 - recall: 0.6269 - precision: 0.8476 - auc: 0.9800 - val_loss: 0.0079 - val_accuracy: 0.9974 - val_recall: 0.7860 - val_precision: 0.8899 - val_auc: 0.9873 
...
Epoch 10/10
2891/2891 [==============================] - 50s 17ms/step - loss: 0.0853 - accuracy: 0.9660 - recall: 0.9665 - precision: 0.9655 - auc: 0.9952 - val_loss: 0.0865 - val_accuracy: 0.9648 - val_recall: 0.9615 - val_precision: 0.9679 - val_auc: 0.9949
964/964 [==============================] - 3s 3ms/step
Shape: X_test, y_test, y_pred
(30837, 32, 9, 19) (30837, 1) (30837,)
Accuracy:  0.5002432143204592
Recall:  0.0
Precision:  0.0
ROC AUC:  0.5
F1-weighted score: 0.33360360651524557

When printing the y_true and y_pred arrays after running the softmax model, after being converted from categorical to numerical, I get:

y_true:(array([0, 1], dtype=int64), array([15426, 15411], dtype=int64))

y_pred: (array([0, 1], dtype=int64), array([15360, 15477], dtype=int64))

However, when I run the same for the sigmoid model, I get:

y_true: (array([0, 1], dtype=uint8), array([15426, 15411], dtype=int64))

y_pred: (array([0], dtype=int64), array([30837], dtype=int64))

It is apparent that not a single '1' label is predicted, which explains the scores above. What causes this behavior, and how can I fix it?
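One thing that may be worth checking (an assumption on my part, not a confirmed diagnosis) is the np.argmax call in the sigmoid version. A single sigmoid unit yields predictions of shape (N, 1), and argmax over an axis of length 1 is always 0, whereas class labels would normally be obtained by thresholding the probability. The sketch below uses made-up probabilities, and the 0.5 threshold is an assumed default.

```python
import numpy as np

# Made-up sigmoid outputs with shape (4, 1), as a single-unit model would return.
probs = np.array([[0.03], [0.97], [0.61], [0.12]])

# argmax along an axis of length 1 can only ever return index 0.
labels_argmax = np.argmax(probs, axis=1)

# Without an axis, argmax returns one flat index for the whole array.
index_of_max = np.argmax(probs)

# Thresholding the probability gives one 0/1 label per sample.
labels_threshold = (probs > 0.5).astype(int).ravel()

print(labels_argmax)     # all zeros
print(index_of_max)      # a single integer
print(labels_threshold)  # per-sample 0/1 labels
```

In my script this would correspond to replacing the argmax line with something like `y_pred = (predictions > 0.5).astype(int).ravel()`, while for ROC AUC the raw probabilities in `predictions` are probably the better input than hard labels.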

Thank you
