I am trying to do hyperparameter tuning with Optuna on the MovieLens (1M) dataset. A single script tunes Lasso, Ridge, and KNN models. Optuna works fine for Lasso and Ridge but gets stuck on KNN.
The log below shows the last Ridge trial finishing at 2021-07-22 18:33:53, and a new study being created for KNN immediately afterwards, at the same timestamp. At the time of posting it is 2021-07-23 11:07:48, and not a single KNN trial has finished.
[I 2021-07-22 18:33:53,959] Trial 199 finished with value: -1.1917496039282074 and parameters: {'alpha': 3.553292157377711e-07, 'solver': 'sag', 'normalize': False}. Best is trial 71 with value: -1.1917485424789929.
[I 2021-07-22 18:33:53,961] A new study created in memory with name: no-name-208652b3-68ec-4464-a2ae-5afefa9bf133
The same thing is happening with the SVR model: Optuna has been stuck since trial 84 finished at 2021-07-23 05:13:40.
[I 2021-07-23 05:13:37,907] Trial 83 finished with value: -1.593471166487258 and parameters: {'C': 834.9834466420455, 'epsilon': 99.19181748590665, 'kernel': 'linear', 'norm': 'minmax'}. Best is trial 61 with value: -1.553044709891868.
[I 2021-07-23 05:13:40,261] Trial 84 finished with value: -1.593471166487258 and parameters: {'C': 431.4022584640214, 'epsilon': 2.581688694428477, 'kernel': 'linear', 'norm': 'minmax'}. Best is trial 61 with value: -1.553044709891868.
Could you tell me why Optuna is getting stuck and how I can solve this?
Environment
- Optuna version: 2.8.0
- Python version: 3.8
- OS: Linux CentOS 7
- (Optional) Other libraries and their versions: scikit-learn, pandas, and other common libraries
Reproducible examples
The code I am using for hyperparameter tuning:
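from typing import Callable, Dict

import numpy as np
import optuna
from optuna import Trial
from pandas import DataFrame
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsRegressor


def tune(objective):
    # Run 200 trials across 40 parallel Optuna workers and return the best parameters.
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=200, n_jobs=40)
    params = study.best_params
    return params


# `kfolds` is defined elsewhere in the script (not shown).
def knn_objective(X_train: DataFrame, y_train: DataFrame, cv_method: kfolds) -> Callable[[Trial], float]:
    def objective(trial: Trial) -> float:
        args: Dict = dict(
            n_neighbors=trial.suggest_int("n_neighbors", 2, 40, step=1),
            weights=trial.suggest_categorical("weights", ["uniform", "distance"]),
            metric=trial.suggest_categorical("metric", ["euclidean", "manhattan", "mahalanobis"]),
        )
        estimator = KNeighborsRegressor(**args)
        # Each trial also parallelises its cross-validation over all cores (n_jobs=-1).
        scores = cross_validate(
            estimator, X=X_train, y=y_train, scoring="neg_mean_squared_error", cv=cv_method, n_jobs=-1
        )
        return float(np.mean(scores["test_score"]))

    return objective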