I'm exploring several ML models for an in-sample forecasting task, and I'm wondering whether there is a straightforward way to identify/detect good vs. bad learning.
The classic approach for deep learning models is to plot the loss curves, i.e. the history of the loss function on the training and validation/test sets over epochs. What about classic ML models?
Ideally, something would tell us (e.g., print a short verdict) whether learning/generalization was good or bad:
- good fit on the training data and good generalization => good fit
- good fit on the training data but bad generalization => overfitting
- bad fit even on the training data => underfitting
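
Concretely, I imagine a heuristic along these lines (a rough sketch of my own, not an existing API; the R² thresholds gap_tol/low_tol are arbitrary and data-dependent, and Ridge/make_regression are just placeholders for my real pipeline and data):

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_validate

def fit_verdict(estimator, X, y, cv=5, gap_tol=0.1, low_tol=0.5):
    # R^2 is used so that "high" and "low" scores have an absolute meaning.
    res = cross_validate(estimator, X, y, cv=cv, scoring="r2",
                         return_train_score=True)
    train, test = res["train_score"].mean(), res["test_score"].mean()
    if train < low_tol:
        verdict = "underfitting (bad fit even on the training folds)"
    elif train - test > gap_tol:
        verdict = "overfitting (good fit, bad generalization)"
    else:
        verdict = "good fit / good generalization"
    print(f"train R2={train:.2f}, CV R2={test:.2f} => {verdict}")

X, y = make_regression(n_samples=300, n_features=20, noise=10.0, random_state=0)
fit_verdict(Ridge(alpha=1.0), X, y)
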
To the best of my knowledge, the current best practice / state of the art is:
- as with deep learning models, plot a learning curve for the ML model, but over the number of samples in the training set instead of epochs (see the sketch after this list), or
- check for potential over-/under-fitting using a plot based on scoring="neg_mean_squared_error":
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import LearningCurveDisplay

# Evaluate the model with 10-fold cross-validation (pipeline, X, y defined elsewhere)
scores = cross_val_score(pipeline, X, y, scoring="neg_mean_squared_error", cv=10)
- I'm not sure, but it seems this approach has nothing to do with diagnosing whether learning went wrong; it is only for model evaluation, i.e., comparing models or plotting the true vs. estimated coefficients.
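
For the learning-curve route, here is roughly what I have in mind (a sketch only; LearningCurveDisplay requires scikit-learn >= 1.2, and the Ridge pipeline and make_regression data are placeholders for my actual setup):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import LearningCurveDisplay
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=500, n_features=20, noise=15.0, random_state=0)
pipeline = make_pipeline(StandardScaler(), Ridge(alpha=1.0))

# Train and cross-validated MSE as a function of the number of training samples:
# converging curves with a small gap suggest a good fit, a persistent large gap
# suggests overfitting, and both curves plateauing at a high error suggest underfitting.
LearningCurveDisplay.from_estimator(
    pipeline, X, y,
    train_sizes=np.linspace(0.1, 1.0, 8),
    cv=10,
    scoring="neg_mean_squared_error",
    negate_score=True,   # plot MSE instead of its negative
    score_type="both",   # show both the train and the validation curve
)
plt.show()
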
I would be happy if someone could share a Pythonic solution, such as a new package, library, wrapper, or recent workaround, that gives this kind of insight into an ML model's learning.
Side note: this question is about best practice for identifying the quality of learning only; I'm not looking for ways to diagnose and treat bad generalization, e.g., fine-tuning hyper-parameters with GridSearchCV!