Get the label mappings from label encoder

I am using the following code to map a list of string labels to a list of one-hot-encoded values:

from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder  
labelEncoder = LabelEncoder()
targets = ["blue","green","blue","blue","green"]    
integerEncoded = labelEncoder.fit_transform(targets)

At a later stage I need to know exactly which string labels are mapped to which integer values.

That is, I need something like this:

integerMapping = GetIntegerMapping(labelEncoder)

Where

integerMapping["blue"]

should return the int value to which all "blue" labels are mapped

and

integerMapping["green"]

should return the int value to which all "green" labels are mapped.

How can I get that integerMapping dictionary?



Solution 1:[1]

Once the label encoder is fitted, it exposes a classes_ attribute that holds the sorted unique labels. The integer used to replace a label is the index of that label in this array, so you can get the mapping with:

le = LabelEncoder()
le.fit(targets)
integer_mapping = {l: i for i, l in enumerate(le.classes_)}
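
As a minimal sketch of this approach using the question's data (the `str(...)` conversion and the `inverse_mapping` name are illustrative additions, not part of the answer above), the mapping and its reverse could be built like this:

```python
from sklearn.preprocessing import LabelEncoder

targets = ["blue", "green", "blue", "blue", "green"]

le = LabelEncoder()
le.fit(targets)

# classes_ holds the sorted unique labels; a label's index is its encoded integer.
# str() turns NumPy string scalars into plain Python strings (optional).
integer_mapping = {str(label): index for index, label in enumerate(le.classes_)}

# Reverse lookup: encoded integer -> original label.
inverse_mapping = {index: label for label, index in integer_mapping.items()}

print(integer_mapping["blue"], integer_mapping["green"])
```

For whole arrays of encoded values, `le.inverse_transform` performs the reverse lookup directly.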

Solution 2:[2]

You can build a dictionary that maps each target label to its encoded integer:

integerMapping=dict(zip(targets,integerEncoded))
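
Note that the zip above needs the original `targets` list and its encoded array to still be available. A variant that only needs the fitted encoder, sketched here under the assumption that `labelEncoder` was fitted as in the question, runs `classes_` back through `transform`. The `str(...)` and `int(...)` conversions are an optional addition to get plain Python values:

```python
from sklearn.preprocessing import LabelEncoder

targets = ["blue", "green", "blue", "blue", "green"]
labelEncoder = LabelEncoder()
labelEncoder.fit(targets)

# Encode every known class once, then pair each class with its code.
classes = labelEncoder.classes_
codes = labelEncoder.transform(classes)
integerMapping = {str(label): int(code) for label, code in zip(classes, codes)}
```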

Solution 3:[3]

Here is a simple answer:

import pandas as pd
from sklearn.preprocessing import LabelEncoder


# Helper function to get the mapping between original labels and encoded labels
def get_label_map(df: pd.DataFrame, label: str) -> dict:
    """Get the mapping between each original label and its encoded value.

    df: a pandas DataFrame with both the feature variables and the target variable
    label: the name of the target variable
    """
    le = LabelEncoder()  # init label encoder
    # Pass the 1-D column df[label]; df[[label]] is 2-D and triggers a conversion warning
    y_le = le.fit_transform(df[label])  # encode target variable
    # Pair each original label with its encoded value (cast to a plain Python int)
    label_map = {original: int(code) for original, code in zip(df[label], y_le)}
    return label_map


# Example:
df0 = pd.DataFrame({'fea1': [1, 2, 3, 4], 'fea2': ['a', 'b', 'b', 'c'], 'target': ['cat', 'cat', 'dog', 'monkey']})

label_map = get_label_map(df=df0, label='target')  # {'cat': 0, 'dog': 1, 'monkey': 2}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Alexis M
Solution 2
Solution 3