Predict() new data into PCA space in R

After performing a principal component analysis of a first data set (a), I projected a second data set (b) into PCA space of the first data set.

From this, I want to extract the variable loadings for the projected analysis of (b). Variable loadings of the PCA of (a) are returned by prcomp(). How can I retrieve the variable loadings of (b), projected into PCA space of (a)?

# set seed and define variables
set.seed(1)
a = replicate(10, rnorm(10))
b = replicate(10, rnorm(10))

# pca of data A and project B into PCA space of A
pca.a = prcomp(a)
project.b = predict(pca.a, b)

# variable loadings
loads.a = pca.a$rotation


Solution 1:[1]

Here's an annotated version of your code to make it clear what is happening at each step. First, the original PCA is performed on matrix a:

pca.a = prcomp(a)

This calculates the loadings (the rotation matrix, pca.a$rotation) for each principal component (PC). In the next step, predict() centers the new data set, b, using the column means of a and multiplies it by these same loadings to calculate PC scores:

project.b = predict(pca.a, b)
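
As a minimal sketch of what predict() does here, assuming the default prcomp() settings used in the question (centering, no scaling), the projection can be reproduced by hand. The name manual.b is only illustrative and not part of the original code:

```r
set.seed(1)
a = replicate(10, rnorm(10))
b = replicate(10, rnorm(10))

pca.a = prcomp(a)
project.b = predict(pca.a, b)

# center b with the column means of a (not of b), then rotate with a's loadings;
# pca.a$scale is FALSE here because prcomp() was called without scale. = TRUE
manual.b = scale(b, center = pca.a$center, scale = pca.a$scale) %*% pca.a$rotation

# compare the manual projection with the result of predict()
all.equal(manual.b, project.b, check.attributes = FALSE)
```

Nothing in this calculation estimates anything from b: the centering vector and the rotation matrix both come from pca.a.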

So the loadings are the same as for a (there is no separate set of loadings for b), but the PC scores are different. If we look at project.b, we see that each row is an observation from b and each column corresponds to a PC:

            PC1         PC2         PC3        PC4         PC5          PC6         PC7         PC8
 [1,] -0.2922447  0.10253581  0.55873366  1.3168437  1.93686163  0.998935945  2.14832483 -1.43922296
 [2,]  0.1855480 -0.97631967 -0.06419207  0.6375200 -1.63994127  0.110028191 -0.27612541 -0.37640710
 [3,] -1.5924242  0.31368878 -0.63199409 -0.2535251  0.59116005  0.214116915  1.20873962 -0.64494388
 [4,]  1.2117977  0.29213928  1.53928110 -0.7755299  0.16586295  0.030802395  0.63225374 -1.72053189
 [5,]  0.5637298  0.13836395 -1.41236348  0.2931681 -0.64187233  1.035226594  0.67933996 -1.05234872
 [6,]  0.2874210  1.18573157  0.04358772 -1.1941734 -0.04399808 -0.113752847 -0.33507195 -1.34592414
 [7,]  0.5629731 -1.02835365  0.36218131  1.4117908 -0.96923175 -1.213684882  0.02221423  1.14483112
 [8,]  1.2854406  0.09373952 -1.46038333  0.6885674  0.39455369  0.756654205  1.97699073 -1.17281174
 [9,]  0.8573656  0.07810452 -0.06576772 -0.5200661  0.22985518  0.007571489  2.29289637 -0.79979214
[10,]  0.1650144 -0.50060018 -0.14882996  0.2065622  2.79581428  0.813803739  0.71632238  0.09845912
              PC9      PC10
 [1,] -0.19795112 0.7914249
 [2,]  1.09531789 0.4595785
 [3,] -1.50564724 0.2509829
 [4,]  0.05073079 0.6066653
 [5,] -1.62126318 0.1959087
 [6,]  0.14899277 2.9140809
 [7,]  1.81473300 0.0617095
 [8,]  1.47422298 0.6670124
 [9,] -0.53998583 0.7051178
[10,]  0.80919039 1.5207123
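
A quick check along these lines, again only a sketch using the same toy data as the question, may help to separate the two objects: project.b holds scores, while the loadings stay in pca.a$rotation.

```r
set.seed(1)
a = replicate(10, rnorm(10))
b = replicate(10, rnorm(10))

pca.a = prcomp(a)
project.b = predict(pca.a, b)

# scores: rows are observations of b, columns are PCs
dim(project.b)
colnames(project.b)

# loadings: rows are variables, columns are PCs; predict() does not change them
loads.a = pca.a$rotation
dim(loads.a)

# the scores of b are not the scores of a, which prcomp() stores in pca.a$x
isTRUE(all.equal(project.b, pca.a$x, check.attributes = FALSE))
```

If loadings specific to b are needed, that would require a separate prcomp(b), which is a different analysis from projecting b into the PCA space of a.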

Hopefully, that makes sense, but I'm yet to finish my first coffee of the day, so no guarantees.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Lyngbakr