I want to stop training my CNN when a custom logged metric (not implemented in Keras) stops improving, with a patience of 5 (I chose the macro F1 score). Here's what I did:

Created a callback to log the macro f1 score on epoch end and an early stopping:

early_stopping = EarlyStopping(monitor='val_macro_f1', patience=5, restore_best_weights=True)
macro_f1_callback = MacroF1Callback(x_valid_combined_tfidf, y_valid)
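For context, here is a framework-free sketch of what a `MacroF1Callback` like this presumably does (everything beyond the names in the question is an assumption; a real implementation would subclass `keras.callbacks.Callback` and call `self.model.predict`). The key detail is that the callback must write the score into the `logs` dict under the exact key `'val_macro_f1'`, and come before the `EarlyStopping` instance in the callbacks list, otherwise `EarlyStopping` has nothing to monitor:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro average)."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)


class MacroF1Callback:  # in real code: a keras.callbacks.Callback subclass
    def __init__(self, x_valid, y_valid):
        self.x_valid = x_valid
        self.y_valid = y_valid

    def on_epoch_end(self, epoch, logs=None):
        # In Keras this would be: self.model.predict(self.x_valid).argmax(-1)
        y_pred = self.predict(self.x_valid)  # placeholder for model.predict
        score = macro_f1(self.y_valid, y_pred)
        print(f"Validation Macro F1 Score: {score:.4f}")
        # Crucial: publish the score so EarlyStopping can monitor it.
        if logs is not None:
            logs["val_macro_f1"] = score
```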

And here is the (simplified) output of the fitting:

Epoch 1/20
Validation Macro F1 Score: 0.3983
Epoch 2/20
Validation Macro F1 Score: 0.3369
Epoch 3/20
Validation Macro F1 Score: 0.4057
Epoch 4/20
Validation Macro F1 Score: 0.3947
Epoch 5/20
Validation Macro F1 Score: 0.3761
Epoch 6/20
Validation Macro F1 Score: 0.3918
Epoch 7/20
Validation Macro F1 Score: 0.4147
<keras.src.callbacks.History at 0x4cae76210>

And after predicting again on the validation data, it seems that early stopping chose the "best" weights to be from... epoch 2:

F1 score: 0.33687923314086654

None of this makes sense to me: OK, the metric started decreasing at epoch 2, since it's lower than epoch 1, and training ends at epoch 7, but epoch 7 has a metric even better than epoch 1.

Can anyone help me with this? Maybe I'm doing something wrong. Also please tell me if you want to paste some more code here.

Thanks!


1 Answer


In my view, everything happened exactly as configured. In the 7th epoch, EarlyStopping concluded: "the lowest metric value was seen 5 epochs ago and the model has stopped improving, so let's stop training" - exactly how you defined it.

The problem is that EarlyStopping assumes by default that lower is better. With `mode='auto'` (the default), Keras doesn't recognize the name `'val_macro_f1'` and in most versions falls back to minimizing it - which makes sense for a loss, but your macro F1 score is something you want to increase. Tell EarlyStopping to maximize instead:

early_stopping = EarlyStopping(monitor='val_macro_f1', mode='max', patience=5, restore_best_weights=True)

Alternatively, you can log the negative of the F1 score and let the default minimization work on that, or monitor an actual loss such as the validation categorical cross-entropy instead.
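To see concretely why epoch 2 was restored, here is a minimal pure-Python re-implementation of EarlyStopping's patience bookkeeping (a sketch of the logic, not the real Keras code), run on the F1 values from the question:

```python
def best_epoch(scores, patience, mode):
    """Return the 1-based epoch whose weights EarlyStopping would restore."""
    better = (lambda a, b: a < b) if mode == "min" else (lambda a, b: a > b)
    best, best_at, wait = None, None, 0
    for epoch, s in enumerate(scores, start=1):
        if best is None or better(s, best):
            best, best_at, wait = s, epoch, 0  # new best: reset the counter
        else:
            wait += 1                          # no improvement this epoch
            if wait >= patience:
                break                          # patience exhausted: stop
    return best_at

f1 = [0.3983, 0.3369, 0.4057, 0.3947, 0.3761, 0.3918, 0.4147]
print(best_epoch(f1, patience=5, mode="min"))  # 2 - what actually happened
print(best_epoch(f1, patience=5, mode="max"))  # 7 - with the direction fixed
```

Under `mode="min"`, the lowest score (0.3369, epoch 2) is "best", epochs 3-7 count as five non-improving epochs, and training stops with epoch 2's weights restored - matching your output exactly. Under `mode="max"`, epoch 7 is the best so far and training would have continued.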