Get criteria for decision (binary classifier) of trained model when making prediction on sample input
I have a trained RandomForestClassifier from sklearn. I trained the model on a set of data. Each data sample consists of 10 columns (Bank Loan Approval data) and the model has to predict whether the loan is approved or not. It seems to work OK, and when I test the model on new data it makes the correct decision.
I first load the trained model and the prepared data set from files, then drop the result column ('LoanApproved') from the data set:
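import pandas as pd
import pickle
# Load the trained model; the context manager closes the file afterwards
with open('trained_model.sav', 'rb') as model_file:
    model = pickle.load(model_file)
# Load the prepared data and drop the target column so only the features remain
data = pd.read_csv('clean_data.csv', index_col=0)
data = data.drop(columns=['LoanApproved'])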
This data set contains cleaned/prepared data for model validation.
I then take a row (data sample) from the set and pass it to the trained model to get a prediction:
single_entry = data.iloc[55379].values.reshape(1, -1)
result = model.predict(single_entry)
print(result)
The result is a 1 or a 0, depending on the prediction. Now I want to get the reason behind the model's prediction: which of the features in the table below were the most important in this (single) case?
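A first, coarse hint is how strongly the forest leans towards its answer for this one row. The following is only a minimal sketch: it uses synthetic data from `make_classification` as a stand-in for the loan data, and the variable names are assumptions. It relies on `predict_proba`, which for a random forest returns the class probabilities averaged over the trees.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the loan data (assumption: 10 feature columns, binary target).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# One sample, reshaped to (1, n_features) as in the prediction call above.
single_entry = X[7].reshape(1, -1)
predicted_class = model.predict(single_entry)[0]

# Probabilities are ordered like model.classes_ and sum to 1.
class_probabilities = model.predict_proba(single_entry)[0]
for label, probability in zip(model.classes_, class_probabilities):
    print(f"class {label}: {probability:.2f}")
```

This says how confident the model is, not why; it does not attribute the decision to individual features.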
I can extract the importance of individual parameters from the trained model like:
# data holds only the feature columns here (assumed to be in the same order as during training)
feature_importances = pd.DataFrame(model.feature_importances_, index=data.columns,
                                   columns=['importance']).sort_values('importance', ascending=False)
which gives me this table with features sorted in descending order of importance:
| | importance |
|---|---|
| ApplicantCreditHistory | 0.838678 |
| LoanIntRate | 0.046705 |
| LoanAmountLog | 0.038982 |
| ApplicantIncomeLog | 0.031434 |
| ApplicantEmplLength | 0.013662 |
| ApplicantDependents | 0.009104 |
| LoanTermLog | 0.006683 |
| ApplicantHomeOwn | 0.006161 |
| ApplicantMarried | 0.005745 |
| ApplicantSelfEmployed | 0.002846 |
But these importances describe the model as a whole. When I later input a single data sample and it gives me a 1 or a 0 (True or False), I want to know what that particular decision is based on.
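One rough way to look at a single decision is to follow the sample through every tree and count which features are actually tested along its path. The sketch below is an illustration under assumptions, not a definitive attribution method: the data is synthetic, the `feature_names` are hypothetical, and a raw split count ignores how much each split changes the predicted probability. It uses each tree's `decision_path` method and the `tree_.feature` array, in which leaf nodes carry a negative index.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the loan data (assumption: the real data is not available here).
X, y = make_classification(n_samples=500, n_features=10, n_informative=4, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical column names

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

single_entry = X[42].reshape(1, -1)
prediction = model.predict(single_entry)[0]

# Count how often each feature is tested on the path this sample takes through each tree.
split_counts = np.zeros(X.shape[1], dtype=int)
for tree in model.estimators_:
    node_indicator = tree.decision_path(single_entry)  # sparse matrix, shape (1, n_nodes)
    visited_nodes = node_indicator.indices              # ids of the nodes the sample passed
    features_at_nodes = tree.tree_.feature[visited_nodes]
    # Leaves have a negative feature index and perform no test, so skip them.
    for feature in features_at_nodes[features_at_nodes >= 0]:
        split_counts[feature] += 1

per_sample_usage = pd.Series(split_counts, index=feature_names).sort_values(ascending=False)
print("prediction:", prediction)
print(per_sample_usage)
```

Unlike `feature_importances_`, these counts change from sample to sample, because different rows travel down different branches.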
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
