sklearn.exceptions.NotFittedError: This LabelEncoder instance is not fitted yet

I'm trying to run voice recognition code from GitHub HERE that analyzes voice. There is an example in final_results_gender_test.ipynb that illustrates the steps for both training and inference. So I copied and adjusted the inference part and came up with the following code, which uses the trained model for inference only. But I'm not sure why I get this error complaining that This LabelEncoder instance is not fitted yet.

How can I fix the problem? I'm only doing inference, so why do I need to fit anything?

Traceback (most recent call last):
  File "C:\Users\myname\Documents\Speech-Emotion-Analyzer-master\audio.py", line 53, in <module>
    livepredictions = (lb.inverse_transform((liveabc)))
  File "C:\Users\myname\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\preprocessing\label.py", line 272, in inverse_transform
    check_is_fitted(self, 'classes_')
  File "C:\Users\myname\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\utils\validation.py", line 914, in check_is_fitted
    raise NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.exceptions.NotFittedError: This LabelEncoder instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
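
For reference, the error itself is not specific to this model: a fresh LabelEncoder has no classes_ attribute until fit is called, so inverse_transform raises immediately. A minimal sketch with made-up labels, just to illustrate the message:

from sklearn.preprocessing import LabelEncoder
import numpy as np

lb = LabelEncoder()
# lb.inverse_transform(np.array([0]))  # raises NotFittedError, same as above

# after fitting on some (hypothetical) labels, classes_ exists and decoding works
lb.fit(["female_angry", "female_calm", "male_angry", "male_calm"])
print(lb.inverse_transform(np.array([2])))  # ['male_angry']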

Here is my copied/adjusted code from the notebook:

import os
from keras import regularizers
import keras
from keras.callbacks import ModelCheckpoint
from keras.layers import Conv1D, MaxPooling1D, AveragePooling1D, Dense, Embedding, Input, Flatten, Dropout, Activation, LSTM
from keras.models import Model, Sequential, model_from_json
from keras.preprocessing import sequence
from keras.preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer
from keras.utils import to_categorical
import librosa
import librosa.display
from matplotlib.pyplot import specgram
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

opt = keras.optimizers.rmsprop(lr=0.00001, decay=1e-6)
lb = LabelEncoder()

json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights("saved_models/Emotion_Voice_Detection_Model.h5")
print("Loaded model from disk")
 
X, sample_rate = librosa.load('h04.wav', res_type='kaiser_fast',duration=2.5,sr=22050*2,offset=0.5)
sample_rate = np.array(sample_rate)
mfccs = np.mean(librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=13),axis=0)
featurelive = mfccs
livedf2 = featurelive
livedf2= pd.DataFrame(data=livedf2)
livedf2 = livedf2.stack().to_frame().T
twodim= np.expand_dims(livedf2, axis=2)
livepreds = loaded_model.predict(twodim, batch_size=32, verbose=1)

livepreds1=livepreds.argmax(axis=1)
liveabc = livepreds1.astype(int).flatten()
livepredictions = (lb.inverse_transform((liveabc)))
print(livepredictions)


Solution 1:

I was facing the same problem. It's probably too late for you, but I want to give a solution for anyone who still gets this error.

I was using this code from GitHub.

In the README file you can see this note: "NOTE: If you are using the model directly and want to decode the output ranging from 0 to 9 then the following list will help you."

Since the author already gives that list, just delete this part from your code:

livepredictions = (lb.inverse_transform((liveabc)))
print(livepredictions)

Since the class-to-label mapping is already given, we don't need a fitted LabelEncoder to decode the predictions.

Instead of those lines, add the following. I prefer to use a dictionary and print the result from there.

Sentiments = { 0 : "Female_angry",
               1 : "Female Calm",
               2 : "Female Fearful",
               3 : "Female Happy",
               4 : "Female Sad",
               5 : "Male Angry",
               6 : "Male calm",
               7 : "Male Fearful",
               8 : "Male Happy",
               9 : "Male sad"
}

Way 1: Use a list comprehension to get the value from the dictionary.

Result = [emotions for (number, emotions) in Sentiments.items() if liveabc[0] == number]  # liveabc[0] is the predicted class index
print(Result)

Way 2: Or simply loop over the dictionary and print the matching value.

for number, emotions in Sentiments.items():
    if liveabc[0] == number:
        print(emotions)

If you use Way 1, it will print something like ['Male Angry']. If you use Way 2, it will print Male Angry.
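
A third option, assuming liveabc always holds exactly one prediction (which it does here, since we predict on a single clip), is a direct dictionary lookup:

emotion = Sentiments[int(liveabc[0])]  # liveabc[0] is the predicted class index (0-9)
print(emotion)                         # e.g. Male Angry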

So the full code will be like this:

from keras.models import model_from_json
import librosa
import numpy as np
import pandas as pd


Sentiments = { 0 : "Female_angry",
               1 : "Female Calm",
               2 : "Female Fearful",
               3 : "Female Happy",
               4 : "Female Sad",
               5 : "Male Angry",
               6 : "Male calm",
               7 : "Male Fearful",
               8 : "Male Happy",
               9 : "Male sad"
}

# load the model architecture and weights
json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
loaded_model.load_weights("saved_models/Emotion_Voice_Detection_Model.h5")
print("Loaded model from disk")

# extract MFCC features from the audio file and reshape them for the model
X, sample_rate = librosa.load('output10.wav', res_type='kaiser_fast', duration=2.5, sr=22050*2, offset=0.5)
sample_rate = np.array(sample_rate)
mfccs = np.mean(librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=13), axis=0)
featurelive = mfccs
livedf2 = featurelive
livedf2 = pd.DataFrame(data=livedf2)
livedf2 = livedf2.stack().to_frame().T
twodim = np.expand_dims(livedf2, axis=2)

# predict and decode the class index with the dictionary instead of a LabelEncoder
livepreds = loaded_model.predict(twodim, batch_size=32, verbose=1)
livepreds1 = livepreds.argmax(axis=1)
liveabc = livepreds1.astype(int).flatten()

Result = [emotions for (number, emotions) in Sentiments.items() if liveabc[0] == number]
print(Result)
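
As an alternative to hard-coding the mapping, if you can edit or re-run the training script, you could fit the LabelEncoder there, persist it, and load the already-fitted encoder at inference time. A sketch assuming the original string labels are available in a variable y and using joblib (not part of the original repository):

# --- training script (assumption: the original string labels are in `y`) ---
import joblib
from sklearn.preprocessing import LabelEncoder

lb = LabelEncoder()
y_encoded = lb.fit_transform(y)          # fit learns classes_; transform maps labels to integers
joblib.dump(lb, 'label_encoder.joblib')  # persist the fitted encoder

# --- inference script ---
lb = joblib.load('label_encoder.joblib')         # classes_ is restored, so no NotFittedError
livepredictions = lb.inverse_transform(liveabc)  # decode integer predictions back to labels
print(livepredictions)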

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
