
I am currently training an ANN using Sequential (a class from the Keras API within TensorFlow). While optimizing the model's architecture, I came across something I have not seen before.

The graph of the test accuracy looks odd: instead of a smooth curve, it shows an unusual pattern.

Model

import pandas as pd
import matplotlib.pyplot as plt  # needed for the plots below
from tensorflow import keras
from tensorflow.keras import layers

from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

features = data[['Vcom', 'T', 'Vair']]
labels = data[['COP', 'CC']]

features_train, features_test, labels_train, labels_test = train_test_split(features, labels, test_size=0.2, random_state=42)

from tensorflow.keras.layers import Dropout, BatchNormalization
from tensorflow.keras import regularizers

model = Sequential()
l2_regularizer = regularizers.l2(0.01)

model.add(Dense(64, activation='relu', input_shape=(3,), kernel_regularizer=l2_regularizer))
model.add(BatchNormalization())

model.add(Dropout(0.6))
model.add(Dense(32, activation='relu', kernel_regularizer=l2_regularizer))

model.add(BatchNormalization())

model.add(Dropout(0.6))
model.add(Dense(32, activation='relu'))
model.add(BatchNormalization())

model.add(Dense(2))  # 2 output neurons for output1 and output2

model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])

model_history=model.fit(features_train, labels_train, epochs=150, batch_size=32, validation_data=(features_test, labels_test))

print(model_history.history.keys())

# summarize history for accuracy

plt.plot(model_history.history['accuracy'])
plt.plot(model_history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

My questions are:

  1. Why is the test accuracy showing odd patterns?

  2. How can I fix this problem?

[Plot: model accuracy for train and test data over epochs]


1 Answer


Why is the test accuracy showing odd patterns?

As @mohottnad mentioned in the comments, it appears your model overfits. That means it doesn't generalise well and performs poorly on the test data. I don't know the details of your data, but this strange accuracy pattern might be explained as follows:

  1. after many epochs your model is so overtrained that it returns just one class.
  2. at epoch 100 this class is A, and it correctly covers almost 60% of the data.
  3. at epoch 115 the decision threshold shifts slightly and the model's guess becomes B, which is correct for roughly 1/3 of the data.
  4. and so on, and so on.

This is just a hypothesis for you to dissect; your situation probably isn't as simple as I described, but it might be very similar. You can also check whether you have an imbalanced dataset, which might be an issue; see the quick check below.
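A minimal sketch of such a check, assuming the labels_train DataFrame from your question: value_counts() would apply to categorical targets, while a histogram is more informative for continuous targets such as COP and CC.

# Hypothetical sketch: inspect the distribution of the training targets.
import matplotlib.pyplot as plt

print(labels_train.describe())  # summary statistics per target column
labels_train.hist(bins=30)      # one histogram per target (COP, CC)
plt.show()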

How can I fix this problem?

I don't know how complex your variables are, but you have just 3 input variables and you feed them into 64->32->32 neurons. In many cases that's too much. I would try a much simpler model, like 16->8 neurons or even smaller. How much model complexity you need depends on the complexity of the relationships in your data. Maybe you just need a logistic regression, who knows? It's up to you to experiment and find the right trade-off. The dropout value of 0.6 also appears to be too high for this data and model.
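For illustration, here is a minimal sketch of such a simpler architecture, assuming the same 3 inputs and 2 outputs as in your question; the unit counts and dropout rate are illustrative starting points, not tuned values.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Hypothetical simpler model: 16 -> 8 units, lighter dropout.
model = Sequential([
    Dense(16, activation='relu', input_shape=(3,)),
    Dropout(0.2),                 # much lower than the original 0.6
    Dense(8, activation='relu'),
    Dense(2),                     # linear outputs for COP and CC
])
model.compile(optimizer='adam', loss='mean_squared_error')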