'Predict() new data into PCA space in R
After performing a principal component analysis of a first data set (a), I projected a second data set (b) into PCA space of the first data set.
From this, I want to extract the variable loadings for the projected analysis of (b). Variable loadings of the PCA of (a) are returned by prcomp(). How can I retrieve the variable loadings of (b), projected into PCA space of (a)?
# set seed and define variables
set.seed(1)
a = replicate(10, rnorm(10))
b = replicate (10, rnorm(10))
# pca of data A and project B into PCA space of A
pca.a = prcomp(a)
project.b = predict(pca.a, b)
# variable loadings
loads.a = pca.a$rotation
Solution 1:[1]
Here's an annotated version of your code to make it clear what is happening at each step. First, the original PCA is performed on matrix a:
pca.a = prcomp(a)
This calculates the loadings for each principal component (PC). At the next step, these loadings together with a new data set, b, are used to calculate PC scores:
project.b = predict(pca.a, b)
So, the loadings are the same, but the PC scores are different. If we look at project.b, we see that each column corresponds to a PC:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
[1,] -0.2922447 0.10253581 0.55873366 1.3168437 1.93686163 0.998935945 2.14832483 -1.43922296
[2,] 0.1855480 -0.97631967 -0.06419207 0.6375200 -1.63994127 0.110028191 -0.27612541 -0.37640710
[3,] -1.5924242 0.31368878 -0.63199409 -0.2535251 0.59116005 0.214116915 1.20873962 -0.64494388
[4,] 1.2117977 0.29213928 1.53928110 -0.7755299 0.16586295 0.030802395 0.63225374 -1.72053189
[5,] 0.5637298 0.13836395 -1.41236348 0.2931681 -0.64187233 1.035226594 0.67933996 -1.05234872
[6,] 0.2874210 1.18573157 0.04358772 -1.1941734 -0.04399808 -0.113752847 -0.33507195 -1.34592414
[7,] 0.5629731 -1.02835365 0.36218131 1.4117908 -0.96923175 -1.213684882 0.02221423 1.14483112
[8,] 1.2854406 0.09373952 -1.46038333 0.6885674 0.39455369 0.756654205 1.97699073 -1.17281174
[9,] 0.8573656 0.07810452 -0.06576772 -0.5200661 0.22985518 0.007571489 2.29289637 -0.79979214
[10,] 0.1650144 -0.50060018 -0.14882996 0.2065622 2.79581428 0.813803739 0.71632238 0.09845912
PC9 PC10
[1,] -0.19795112 0.7914249
[2,] 1.09531789 0.4595785
[3,] -1.50564724 0.2509829
[4,] 0.05073079 0.6066653
[5,] -1.62126318 0.1959087
[6,] 0.14899277 2.9140809
[7,] 1.81473300 0.0617095
[8,] 1.47422298 0.6670124
[9,] -0.53998583 0.7051178
[10,] 0.80919039 1.5207123
Hopefully, that makes sense, but I'm yet to finish my first coffee of the day, so no guarantees.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Lyngbakr |
