Correlation between Values in R
I need some help with some R scripting.
I import a data frame via read.csv, which looks like this:

Each value is a PlayerID and each row is a lineup.
My goal is to compute a correlation between PlayerIDs across the lineups (rows). Basically, I want to find out which combinations of PlayerIDs are most commonly used together in a lineup (row).
I tried the correlate() function, but it only correlates the columns.
Thanks in advance!
Edit: the dput output was requested:
> dput(df)
structure(list(D = c(22119364L, 22119376L, 22119376L, 22119372L,
22119372L, 22119372L, 22119372L, 22119373L, 22119372L, 22119372L
), D.1 = c(22119376L, 22119394L, 22119372L, 22119371L, 22119373L,
22119371L, 22119371L, 22119371L, 22119373L, 22119371L), D.2 = c(22119394L,
22119386L, 22119373L, 22119380L, 22119375L, 22119381L, 22119375L,
22119395L, 22119380L, 22119381L), D.3 = c(22119386L, 22119378L,
22119375L, 22119370L, 22119371L, 22119370L, 22119380L, 22119375L,
22119375L, 22119370L), D.4 = c(22119378L, 22119365L, 22119396L,
22119375L, 22119395L, 22119375L, 22119396L, 22119370L, 22119395L,
22119375L), D.5 = c(22119370L, 22119368L, 22119371L, 22119395L,
22119379L, 22119395L, 22119370L, 22119380L, 22119370L, 22119396L
)), class = "data.frame", row.names = c(NA, -10L))
Edit: Combination question: May I also ask whether you want to see which pairs (2 players) are most popular, or which combinations of any size (from 2 to 5)?
Any combination would be interesting
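One possible way to count combinations of any size is to enumerate, for every lineup, all subsets of k players and tally how many lineups contain each subset. The following is only a minimal base-R sketch, not a tuned solution: the helper name count_combos and the "-" separator are assumptions made for this example, and df is the data frame from the dput output above.

```r
# Data frame from the question's dput() output: one lineup per row.
df <- structure(list(
  D   = c(22119364L, 22119376L, 22119376L, 22119372L, 22119372L,
          22119372L, 22119372L, 22119373L, 22119372L, 22119372L),
  D.1 = c(22119376L, 22119394L, 22119372L, 22119371L, 22119373L,
          22119371L, 22119371L, 22119371L, 22119373L, 22119371L),
  D.2 = c(22119394L, 22119386L, 22119373L, 22119380L, 22119375L,
          22119381L, 22119375L, 22119395L, 22119380L, 22119381L),
  D.3 = c(22119386L, 22119378L, 22119375L, 22119370L, 22119371L,
          22119370L, 22119380L, 22119375L, 22119375L, 22119370L),
  D.4 = c(22119378L, 22119365L, 22119396L, 22119375L, 22119395L,
          22119375L, 22119396L, 22119370L, 22119395L, 22119375L),
  D.5 = c(22119370L, 22119368L, 22119371L, 22119395L, 22119379L,
          22119395L, 22119370L, 22119380L, 22119370L, 22119396L)
), class = "data.frame", row.names = c(NA, -10L))

# Hypothetical helper (not from the original post): count how many lineups
# contain each combination of k players (k >= 2).
count_combos <- function(df, k) {
  stopifnot(k >= 2)
  combos <- lapply(seq_len(nrow(df)), function(i) {
    # Sort the IDs so the same set of players always yields the same key.
    ids <- sort(unique(unlist(df[i, ], use.names = FALSE)))
    if (length(ids) < k) return(character(0))
    combn(ids, k, FUN = paste, collapse = "-")
  })
  counts <- table(unlist(combos))
  # Return a plain named integer vector, most frequent combination first.
  sort(setNames(as.integer(counts), names(counts)), decreasing = TRUE)
}

pair_counts   <- count_combos(df, 2)
triple_counts <- count_combos(df, 3)
head(pair_counts)
head(triple_counts)
```

With the sample data, the pairs 22119371-22119375 and 22119372-22119375 each appear in 7 of the 10 lineups. The number of subsets per lineup grows quickly with k, so for long lineups or many rows a dedicated frequent-itemset approach may be more practical.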
Edit: Have you tried transposing your data before computing the correlation? cor(t(df))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1.00000000 0.3467029 -0.01573126 -0.1370310 -0.03689795 -0.1049490 0.2446885 0.6252798 0.3330059 -0.1159602
[2,] 0.34670287 1.0000000 -0.57933025 -0.4193527 -0.65817923 -0.4005822 -0.5205894 0.2304352 -0.3852697 -0.4037538
[3,] -0.01573126 -0.5793303 1.00000000 -0.2454127 0.88159245 -0.2529465 0.9503422 -0.4211406 0.9226407 -0.2566678
[4,] -0.13703100 -0.4193527 -0.24541275 1.0000000 0.19802504 0.9991003 -0.2804525 0.4335783 -0.2075516 0.9996144
[5,] -0.03689795 -0.6581792 0.88159245 0.1980250 1.00000000 0.1907837 0.8156830 -0.2392496 0.8537408 0.1869590
[6,] -0.10494905 -0.4005822 -0.25294651 0.9991003 0.19078368 1.0000000 -0.2834194 0.4701357 -0.2003660 0.9998520
[7,] 0.24468850 -0.5205894 0.95034220 -0.2804525 0.81568296 -0.2834194 1.0000000 -0.3059578 0.9415746 -0.2878249
[8,] 0.62527976 0.2304352 -0.42114058 0.4335783 -0.23924963 0.4701357 -0.3059578 1.0000000 -0.1066355 0.4581980
[9,] 0.33300586 -0.3852697 0.92264065 -0.2075516 0.85374076 -0.2003660 0.9415746 -0.1066355 1.0000000 -0.2092557
[10,] -0.11596019 -0.4037538 -0.25666781 0.9996144 0.18695898 0.9998520 -0.2878249 0.4581980 -0.2092557 1.0000000
This helps to see the correlation between the lineups. Where I'm stuck is seeing correlations per PlayerID across all lineups. Is it possible to get the same kind of output, but per PlayerID (value) instead of per lineup (row)?
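Note that cor(t(df)) treats the PlayerIDs as numbers, so the matrix above compares the numeric ID values at each position rather than whether two lineups share players. One possible way to get a per-player view is to build a lineup-by-player indicator matrix first and work from that. The following is a minimal base-R sketch under that assumption; the variable names long, incidence, cooc and player_cor are illustrative and not from the original post.

```r
# Data frame from the question's dput() output: one lineup per row.
df <- structure(list(
  D   = c(22119364L, 22119376L, 22119376L, 22119372L, 22119372L,
          22119372L, 22119372L, 22119373L, 22119372L, 22119372L),
  D.1 = c(22119376L, 22119394L, 22119372L, 22119371L, 22119373L,
          22119371L, 22119371L, 22119371L, 22119373L, 22119371L),
  D.2 = c(22119394L, 22119386L, 22119373L, 22119380L, 22119375L,
          22119381L, 22119375L, 22119395L, 22119380L, 22119381L),
  D.3 = c(22119386L, 22119378L, 22119375L, 22119370L, 22119371L,
          22119370L, 22119380L, 22119375L, 22119375L, 22119370L),
  D.4 = c(22119378L, 22119365L, 22119396L, 22119375L, 22119395L,
          22119375L, 22119396L, 22119370L, 22119395L, 22119375L),
  D.5 = c(22119370L, 22119368L, 22119371L, 22119395L, 22119379L,
          22119395L, 22119370L, 22119380L, 22119370L, 22119396L)
), class = "data.frame", row.names = c(NA, -10L))

# Long format: one (lineup, player) pair per cell of df.
# unlist() walks df column by column, so the lineup index repeats per column.
long <- data.frame(
  lineup = rep(seq_len(nrow(df)), times = ncol(df)),
  player = unlist(df, use.names = FALSE)
)

# Lineup x player indicator matrix: 1 if the player is in the lineup, else 0.
incidence <- unclass(table(long$lineup, long$player))
incidence[incidence > 1] <- 1  # guard in case an ID repeats within a row

# Player x player co-occurrence counts: entry [i, j] is the number of
# lineups that contain both player i and player j.
cooc <- crossprod(incidence)

# Correlation of presence/absence between players. A player who appears in
# every lineup would have zero variance and produce NA here.
player_cor <- cor(incidence)

cooc["22119371", "22119375"]
round(player_cor[1:5, 1:5], 2)
```

With the sample data, cooc["22119371", "22119375"] is 7, meaning those two players share 7 of the 10 lineups. The diagonal of cooc holds the number of lineups each player appears in.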
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
