Scree Plot for Kernel PCA
I am trying to make a scree plot for kernel PCA. My X has 78 features and 247K samples. I am new to kernel PCA, but I have used scree plots for linear PCA many times. The code below produces the scree plot for linear PCA. I want to use the scree plot to decide how many components I need before actually fitting the model.
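import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Fit PCA with all components and plot the cumulative explained variance ratio
pca = PCA().fit(X)
plt.figure()
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('Number of Principal Components')
plt.ylabel('Cumulative Explained Variance Ratio')
plt.title('Dataset Explained Variance')
plt.show()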
I tried to do the same for kernel PCA, but KernelPCA has no explained_variance_ratio_ attribute, so I computed it the following way.
from sklearn.decomposition import KernelPCA

# Project the first 1000 rows with an RBF kernel, then estimate each component's share of variance
X_kpca = KernelPCA(kernel='rbf', gamma=10, fit_inverse_transform=False).fit_transform(scaled_merged.iloc[:1000])
explained_variance = np.var(X_kpca, axis=0)
explained_variance_ratio = explained_variance / np.sum(explained_variance)
plt.figure()
plt.plot(np.cumsum(explained_variance_ratio))
plt.xlabel('Number of Components')
plt.ylabel('Cumulative Explained Variance Ratio')
plt.title('Dataset Explained Variance')
plt.show()
The kernel PCA scree plot looks wrong: it suggests that I need about 150 components to explain close to 90% of the variance. Is there something wrong with my code?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
