Identify labels with respect to categories after label encoding
I have label encoded a part of my data. Now I want to identify what label was given to which category.
Below is the label encoder code, along with its fit and transform on the dataframe df, which creates df1.
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline

class MultiColumnLabelEncoder:
    def __init__(self, columns=None):
        self.columns = columns  # array of column names to encode

    def fit(self, X, y=None):
        return self  # not relevant here

    def transform(self, X):
        '''
        Transforms columns of X specified in self.columns using
        LabelEncoder(). If no columns specified, transforms all
        columns in X.
        '''
        output = X.copy()
        if self.columns is not None:
            for col in self.columns:
                output[col] = LabelEncoder().fit_transform(output[col])
        else:
            # DataFrame.iteritems() was removed in pandas 2.0; items() is the equivalent
            for colname, col in output.items():
                output[colname] = LabelEncoder().fit_transform(col)
        return output

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)

df1 = MultiColumnLabelEncoder(columns=['EntryTerm', 'DEPENDENCYCODE']).fit_transform(df)
Here EntryTerm has two categories and DEPENDENCYCODE has multiple categories.
I want to identify whether EntryTerm = 082021 was assigned the label 0 or 1, and whether DEPENDENCYCODE = 'B' was assigned the label 0, 1, 2, 3 or 4.
Thanks.
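One possible way to recover the mapping, offered as a sketch rather than a definitive answer: the class above creates a fresh LabelEncoder() for each column and discards it, so the fitted encoder (and its classes_ attribute) is lost. Keeping one fitted encoder per column makes the category-to-label mapping available, since LabelEncoder assigns labels by position in its sorted classes_ array. The sample df below and the names encoders and mappings are assumptions made up for illustration; they are not part of the original code.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the real df; these values are made up.
df = pd.DataFrame({
    "EntryTerm": ["082021", "092021", "082021", "092021"],
    "DEPENDENCYCODE": ["B", "A", "C", "B"],
})

encoders = {}  # column name -> fitted LabelEncoder (illustrative name)
mappings = {}  # column name -> {category: integer label} (illustrative name)

df1 = df.copy()
for col in ["EntryTerm", "DEPENDENCYCODE"]:
    le = LabelEncoder()
    df1[col] = le.fit_transform(df[col])
    encoders[col] = le
    # classes_ holds the sorted unique categories; a category's index
    # in classes_ is the integer label it was assigned.
    mappings[col] = {category: label for label, category in enumerate(le.classes_)}

print(mappings["EntryTerm"])
print(mappings["DEPENDENCYCODE"])
```

With this made-up data, mappings["EntryTerm"]["082021"] gives the label for that term and mappings["DEPENDENCYCODE"]["B"] gives the label for 'B'. Going the other way, encoders[col].inverse_transform([...]) turns labels back into categories. Because the labels follow sorted order, fitting a new LabelEncoder on the original column of df should reproduce the same mapping even if the original encoders were not kept.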
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
