I trained a prediction model with scikit-learn in Python (a Random Forest Regressor), and I want to extract the weights of each feature so that I can build an Excel tool for manual prediction.

The only thing I found is model.feature_importances_, but it doesn't help.

Is there any way to achieve it?

import numpy as np
from sklearn.ensemble import RandomForestRegressor


def performRandomForest(X_train, y_train, X_test, y_test):
    '''Fit a Random Forest regressor and evaluate it.'''
    model = RandomForestRegressor()
    model.fit(X_train, y_train)

    # Make predictions on the held-out data
    expected = y_test
    predicted = model.predict(X_test)

    # Summarize the fit of the model
    mse = np.mean((predicted - expected) ** 2)
    # score() returns R^2 for regressors; it is measured on the training data here
    accuracy = model.score(X_train, y_train)

    return model, mse, accuracy

At the moment I use model.predict([features]) to do it, but I need it in an Excel file.

Tasos

3 Answers

The SKompiler library might help:

from skompiler import skompile
# rf is the fitted RandomForestRegressor; a regressor exposes predict, not predict_proba
skompile(rf.predict).to('excel')

Check out this video.

KT.

Instead of exporting the weights, you can export the model to a pickle file and use xlwings to read the data from the spreadsheet, load the pickled model, and run a prediction. Here's a similar question.
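
A minimal sketch of the pickle half of this approach, assuming a small synthetic dataset and a hypothetical helper named predict_rows. The xlwings part is left out because it needs a running Excel instance; the idea is that the values read from a sheet range would be passed to a helper like this one and the result written back to the sheet. The model is serialized to bytes in memory here, whereas in practice it would go to a file with pickle.dump.

```python
import pickle

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, only for illustration
X, y = make_regression(n_samples=100, n_features=3, random_state=0)
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

# Serialize the fitted model (kept in memory to keep the sketch self-contained)
blob = pickle.dumps(model)


def predict_rows(serialized_model, rows):
    """Hypothetical helper: load the pickled model and predict for a list of feature rows."""
    loaded = pickle.loads(serialized_model)
    return loaded.predict(np.asarray(rows, dtype=float)).tolist()


rows = X[:5].tolist()  # stands in for values read from a spreadsheet range
predictions = predict_rows(blob, rows)
print(predictions)
```

Note that a pickle should only be loaded from a trusted source and with the same scikit-learn version that created it.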

Olel Daniel

I guess you want to extract all the logic that the individual trees follow to arrive at the final regression output. To do that, you first need to extract the logic of each tree and then see which paths are followed. scikit-learn provides this through .decision_path(X), where X is the dataset to predict on. From this you get an idea of how the random forest predicts and which logic is followed at each step.
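
As a rough sketch of what that extracted logic looks like (assuming a small synthetic dataset; the helper name predict_by_hand is hypothetical), the arrays stored in each fitted tree's tree_ attribute can be walked by hand, and averaging the leaf values over all trees reproduces rf.predict. An Excel sheet would have to encode the same comparisons, for example as nested IF formulas, one per tree.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data and a deliberately tiny forest, only for illustration
X, y = make_regression(n_samples=200, n_features=4, noise=0.1, random_state=0)
rf = RandomForestRegressor(n_estimators=5, max_depth=3, random_state=0).fit(X, y)


def predict_by_hand(forest, row):
    """Walk every tree from root to leaf and average the leaf values."""
    # scikit-learn compares features as float32 internally, so do the same here
    row = np.asarray(row, dtype=np.float32)
    leaf_values = []
    for est in forest.estimators_:
        t = est.tree_
        node = 0
        # A leaf node is marked by children_left == -1
        while t.children_left[node] != -1:
            if row[t.feature[node]] <= t.threshold[node]:
                node = t.children_left[node]
            else:
                node = t.children_right[node]
        # For a single-output regressor the leaf stores one mean target value
        leaf_values.append(t.value[node][0][0])
    return float(np.mean(leaf_values))


manual = np.array([predict_by_hand(rf, row) for row in X[:10]])
model_out = rf.predict(X[:10])
print(np.allclose(manual, model_out))

# decision_path reports which nodes each sample visits, across all trees
indicator, n_nodes_ptr = rf.decision_path(X[:10])
print(indicator.shape, n_nodes_ptr)
```

With default settings the trees are far deeper than this, so the number of rules grows quickly; limiting n_estimators and max_depth keeps a spreadsheet version manageable, at some cost in accuracy.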

Once you have extracted the decision_path, you can use Tree Interpreter to obtain the "formula" of the Random Forest you trained. I am not familiar with Tree Interpreter, but it seems to work directly on the model you have trained, i.e.:

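from sklearn.ensemble import RandomForestRegressor
from treeinterpreter import treeinterpreter as ti

# Fit a scikit-learn regressor model (trainX, trainY and testX are your own data)
rf = RandomForestRegressor()
rf.fit(trainX, trainY)

# Each prediction decomposes into a bias term plus per-feature contributions
prediction, bias, contributions = ti.predict(rf, testX)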
Stephen Rauch
Diego