Multiclass classification: explore correlation among classes
I have a classification task on an imbalanced multiclass dataset. To get a better feel for the data, I want to explore correlations among the classes, to see how well they are separated and where they might overlap.
Below is an example with a synthetic dataset that has 5 classes.
from collections import Counter
from sklearn.datasets import make_classification
X, y = make_classification(1000, n_classes=5, n_informative=10, weights=[.1, .13, .15, .17, .45])
class_support = Counter(y)
for key, value in sorted(class_support.items()):
    print(f'Class: {key}, support: {value}')
Class: 0, support: 101
Class: 1, support: 133
Class: 2, support: 148
Class: 3, support: 168
Class: 4, support: 450
I want to visualise the boundaries between these classes to understand their separability, but I have no idea how this could be done using matplotlib or seaborn.
I may have to restrict this to a subset of the features (the more relevant ones), but a general idea of how to visualise class separability would help me get started.
Solution 1:[1]
You can use PCA to reduce the number of dimensions to 3 and then plot the result as a 3D scatter plot, with the points coloured by class.
from sklearn.decomposition import PCA
pca = PCA(n_components=3)
# fit_transform fits the model and projects X in one call, so a separate fit is not needed
X_out = pca.fit_transform(X)
X_out.shape  # (1000, 3)
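The plotting step is not shown above, so here is a minimal sketch of one way it could look with matplotlib. It rebuilds the question's synthetic data; `random_state=0`, the marker size, and the axis labels are assumptions added for illustration, not part of the original question or answer. It also prints how much variance the three components retain, since a projection that keeps little variance may make classes look more mixed than they are in the full feature space.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

# Same synthetic data as in the question; random_state is an assumption
# added here so the figure is reproducible.
X, y = make_classification(1000, n_classes=5, n_informative=10,
                           weights=[.1, .13, .15, .17, .45], random_state=0)

# Project the 20 features onto the first 3 principal components.
pca = PCA(n_components=3)
X_out = pca.fit_transform(X)

# Fraction of total variance kept by the 3 components.
explained = pca.explained_variance_ratio_.sum()
print(f"Variance retained by 3 components: {explained:.2%}")

# One scatter call per class so each class gets its own colour and legend entry.
fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection="3d")
for label in np.unique(y):
    mask = y == label
    ax.scatter(X_out[mask, 0], X_out[mask, 1], X_out[mask, 2],
               s=15, label=f"Class {label}")
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
ax.set_zlabel("PC 3")
ax.legend()
```

Call `plt.show()` to view the figure interactively, or `fig.savefig(...)` to write it to disk. PCA is a linear projection, so overlapping clouds in this plot suggest, but do not prove, that the classes are hard to separate in the original feature space.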
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | desertnaut |
