Get criteria for decision (binary classifier) of trained model when making prediction on sample input

I have a trained RandomForestClassifier from sklearn. Each data sample consists of 10 columns (bank loan approval data), and the model has to predict whether the loan is approved or not. It seems to work OK: when I test the model on new data, it makes the correct decision.

I first load the trained model and the prepared data set from files, then drop the result column ('LoanApproved') from the data set:

import pickle

import pandas as pd

# Load the previously trained RandomForestClassifier
with open('trained_model.sav', 'rb') as f:
    model = pickle.load(f)

# Load the prepared data and drop the target column
data = pd.read_csv('clean_data.csv', index_col=0)
data = data.drop(columns=['LoanApproved'])

This data set contains cleaned/prepared data for model validation.

I then take a random row (data sample) from the set and put it into the trained model to get a prediction:

single_entry = data.iloc[55379].values.reshape(1, -1)
result = model.predict(single_entry)
print(result)

The result is a 1 or a 0, depending on the prediction. Now I want to get the reason behind the model's prediction: which of the features in the table below were the most important in this (single) case?

I can extract the importance of individual features from the trained model like this:

# Assumes data has the same feature columns, in the same order, as the training set
feature_importances = pd.DataFrame(
    model.feature_importances_, index=data.columns, columns=['importance']
).sort_values('importance', ascending=False)

which gives me this table with features sorted by importance in descending order:

                        importance
ApplicantCreditHistory    0.838678
LoanIntRate               0.046705
LoanAmountLog             0.038982
ApplicantIncomeLog        0.031434
ApplicantEmplLength       0.013662
ApplicantDependents       0.009104
LoanTermLog               0.006683
ApplicantHomeOwn          0.006161
ApplicantMarried          0.005745
ApplicantSelfEmployed     0.002846

But these importances are global to the model. When I later input a single data sample and get a 1 or a 0 (True or False), I want to know what that particular decision was based on.
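
One possible way to get a per-sample breakdown, sketched here under assumptions rather than as a definitive answer: walk the sample down every tree in the forest and credit each change in the predicted class fraction to the feature that was split on. The bias (the average root-node fraction) plus the summed contributions should then match `predict_proba` for that sample. The helper names `class_fraction` and `explain_sample` are hypothetical, the synthetic data from `make_classification` only stands in for the loan data, and `target_class=1` assumes the approved class sits at index 1 of `model.classes_`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def class_fraction(tree, node, target_class):
    """Fraction of (weighted) training samples of target_class at a node."""
    counts = tree.value[node, 0]
    # Normalise so this works whether value holds counts or fractions
    return counts[target_class] / counts.sum()


def explain_sample(forest, x, target_class=1):
    """Split the forest's predicted probability for one sample into a bias
    term plus one contribution per feature (hypothetical helper).
    Assumes a single-output classifier and no missing values in x."""
    # sklearn trees compare float32 inputs against their thresholds
    x = np.asarray(x, dtype=np.float32).ravel()
    contributions = np.zeros(x.shape[0])
    bias = 0.0
    for estimator in forest.estimators_:
        tree = estimator.tree_
        node = 0  # start at the root
        bias += class_fraction(tree, node, target_class)
        while tree.children_left[node] != -1:  # -1 marks a leaf
            feature = tree.feature[node]
            if x[feature] <= tree.threshold[node]:
                child = tree.children_left[node]
            else:
                child = tree.children_right[node]
            # Credit the change in class fraction to the split feature
            contributions[feature] += (
                class_fraction(tree, child, target_class)
                - class_fraction(tree, node, target_class)
            )
            node = child
    n_trees = len(forest.estimators_)
    return bias / n_trees, contributions / n_trees


# Synthetic stand-in for the loan data (assumption, not the real data set)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

bias, contributions = explain_sample(forest, X[0])
# Show the three features that moved this prediction the most
for idx in np.argsort(-np.abs(contributions))[:3]:
    print(f"feature {idx}: {contributions[idx]:+.4f}")
print("bias + contributions:", bias + contributions.sum())
print("predict_proba:       ", forest.predict_proba(X[:1])[0, 1])
```

With the real model this would presumably be called as `explain_sample(model, data.iloc[55379].values)`, and the contributions could be paired with `data.columns` to rank the features for that one row. A positive contribution pushes the prediction toward class 1, a negative one toward class 0.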



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
