I have been using a Prophet model for demand forecasting. I have a general question about which data the model should be fitted on before it is passed to the cross_validation function.
from prophet import Prophet
from prophet.diagnostics import cross_validation
# df has the columns 'ds' (dates) and 'y' (demand)
model = Prophet()
model.fit(df)  # fit on the entire data, or only on the data up to `initial`?
initial = '730 days'  # Initial training period
period = '180 days'  # Period between each cutoff point
horizon = '365 days'  # Forecast horizon
# Run cross-validation
df_cv = cross_validation(model, initial=initial, period=period, horizon=horizon)
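To make the setup concrete, here is a rough sketch of how I understand the three parameters to lay out the folds. It uses only the standard library, the function name `sketch_cutoffs` and the five-year date range are made up for illustration, and it reflects my reading of the documentation rather than Prophet's actual implementation.

```python
from datetime import date, timedelta


def sketch_cutoffs(first, last, initial, period, horizon):
    """Return cutoff dates, oldest first, stepping back from the end of the data."""
    cutoff = last - horizon      # the latest cutoff leaves one full horizon after it
    earliest = first + initial   # every fold needs at least `initial` of history
    cutoffs = []
    while cutoff >= earliest:
        cutoffs.append(cutoff)
        cutoff -= period         # move the cutoff back by one period
    return cutoffs[::-1]


# Hypothetical five-year daily series (assumption, not my real data)
horizon = timedelta(days=365)
cutoffs = sketch_cutoffs(
    first=date(2015, 1, 1),
    last=date(2019, 12, 31),
    initial=timedelta(days=730),
    period=timedelta(days=180),
    horizon=horizon,
)
for c in cutoffs:
    print(f"train on ds <= {c}, evaluate on ({c}, {c + horizon}]")
```

If this picture is right, each fold should only use data up to its own cutoff, which is exactly what I am trying to confirm.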
My question is: should the model passed to this function be fitted on the entire time series, or only on the initial part of it? If it is fitted on the entire data, is that not a case of data leakage, since the model would already have seen the validation periods?
I am using cross-validation to find the best hyperparameters and to compare different models, so it is crucial to ensure that there is no data leakage.