How to use scikit-learn inverse_transform with new values
I have a dataset on which I ran scikit-learn PCA. I scaled the data with StandardScaler() before performing PCA.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
# df_data is the original (unscaled) DataFrame
variance_to_retain = 0.99
np_scaled = StandardScaler().fit_transform(df_data)
pca = PCA(n_components=variance_to_retain)
np_pca = pca.fit_transform(np_scaled)
# make dataframe of scaled data
# put column names on scaled data for use later
df_scaled = pd.DataFrame(np_scaled, columns=df_data.columns)
num_components = len(pca.explained_variance_ratio_)
cum_variance_explained = np.cumsum(pca.explained_variance_ratio_)
eigenvalues = pca.explained_variance_
eigenvectors = pca.components_
I then ran K-Means clustering on the scaled dataset. I can plot the cluster centers just fine in scaled space.
My question is: how do I transform the locations of the centers back into the original data space? I know that StandardScaler.fit_transform() makes the data have zero mean and unit variance. But with the new points of shape (num_clusters, num_features), can I use inverse_transform(centers) to get the centers transformed back into the range and offset of the original data?
Thanks, David
Solution 1:[1]
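You can take cluster_centers_ from the fitted KMeans and undo each transform in reverse order: first pca.inverse_transform, then the scaler's inverse_transform. This works for any array with the right number of columns, not just the data the transformers were fitted on.
Here's an example: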
import numpy as np
from sklearn import decomposition
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
iris = datasets.load_iris()
X = iris.data
y = iris.target
# keep a reference to the fitted scaler so it can be inverted later
scal = StandardScaler()
X_t = scal.fit_transform(X)
# project the scaled data onto 3 principal components
pca = decomposition.PCA(n_components=3)
pca.fit(X_t)
X_t = pca.transform(X_t)
# cluster in PCA space
clf = KMeans(n_clusters=3)
clf.fit(X_t)
# map the centers back: PCA space -> scaled space -> original space
scal.inverse_transform(pca.inverse_transform(clf.cluster_centers_))
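The question fits K-Means on the scaled data rather than on the PCA output. In that case only the scaler needs to be inverted. A minimal sketch, assuming the iris data as a stand-in for the asker's DataFrame (the variable names and the `n_init`/`random_state` settings are illustrative choices, not from the original post):

```python
import numpy as np
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 150 samples, 4 features
X = datasets.load_iris().data

# Keep the fitted scaler so it can be inverted later
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Cluster directly in scaled space, as in the question
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(X_scaled)

# Centers have shape (n_clusters, n_features), so the scaler accepts them
centers_original = scaler.inverse_transform(kmeans.cluster_centers_)

# inverse_transform is just x * scale_ + mean_, applied column by column
manual = kmeans.cluster_centers_ * scaler.scale_ + scaler.mean_
print(np.allclose(centers_original, manual))
print(centers_original.shape)
```

The scaler only requires that the number of columns matches what it was fitted on; the number of rows can be anything.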
Note that sklearn has multiple ways to do the fit/transform. You can call StandardScaler().fit_transform(X) directly, but then the fitted scaler is discarded: you can't reuse it on new data, and you can't call inverse_transform on it.
Alternatively, you can create scal = StandardScaler(), then call scal.fit(X) followed by scal.transform(X),
or you can call scal.fit_transform(X), which combines the fit and transform steps. Either way, scal stays available for inverse_transform.
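The equivalence of the two approaches, and the round trip through a retained scaler, can be checked with a short sketch. The random data and the names `scal_a` and `scal_b` are made up for illustration:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data (assumption): 20 samples, 3 features
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(20, 3))

# Option A: fit, then transform
scal_a = StandardScaler()
scal_a.fit(X)
X_a = scal_a.transform(X)

# Option B: fit_transform in one call, keeping the scaler object
scal_b = StandardScaler()
X_b = scal_b.fit_transform(X)

# Both options give the same scaled data
same_result = np.allclose(X_a, X_b)

# Because the scaler was kept, the scaling can be undone
round_trip = np.allclose(scal_b.inverse_transform(X_b), X)
print(same_result, round_trip)
```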
Solution 2:[2]
Here I use SVR to fit the data. Before fitting, I scale both the features and the target, and to read a prediction in the original units I apply inverse_transform to the model's output.
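import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
# X (2-D features) and y (1-D target) are assumed to be existing NumPy arrays
# Create separate scalers for the independent and dependent variables
ss_X = StandardScaler()
ss_y = StandardScaler()
X = ss_X.fit_transform(X)
y = ss_y.fit_transform(y.reshape(-1, 1)).ravel()  # SVR expects a 1-D target
# Create a model object and fit the scaled data
reg = SVR(kernel='rbf')
reg.fit(X, y)
# To make a prediction:
# first, scale the new value with the fitted feature scaler;
# second, inverse-transform the prediction to see it in the original units
# (predict returns a 1-D array, while inverse_transform expects a 2-D one)
pred_scaled = reg.predict(ss_X.transform(np.array([[6.5]])))
ss_y.inverse_transform(pred_scaled.reshape(-1, 1))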
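Since the snippet above depends on an `X` and `y` that the answer does not show, here is a self-contained sketch of the same pattern. The data (ten points of a quadratic) is entirely made up for illustration, and the reshape steps assume a recent scikit-learn version in which scalers require 2-D input:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical data (assumption): one feature, quadratic target
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = X.ravel() ** 2

# One scaler per variable, both kept for later use
ss_X = StandardScaler()
ss_y = StandardScaler()
X_scaled = ss_X.fit_transform(X)
y_scaled = ss_y.fit_transform(y.reshape(-1, 1)).ravel()

reg = SVR(kernel="rbf")
reg.fit(X_scaled, y_scaled)

# Scale the new input with the feature scaler (transform, not fit_transform)
x_new = np.array([[6.5]])
pred_scaled = reg.predict(ss_X.transform(x_new))  # shape (1,)

# Undo the target scaling; reshape because the scaler expects 2-D input
pred = ss_y.inverse_transform(pred_scaled.reshape(-1, 1))  # shape (1, 1)
print(pred.shape)
```

The new input goes through `transform` only, so it is scaled with the mean and variance learned from the training data.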
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Ravi kumar |
