'How to scale by pair of columns together
I want to use the scale function but to do it on each pair of columns - To calculate the mean on pair of columns and not on each column.
In details: This is my data for example:
| phone | phone1_X | phone2 | phone2_X | phone3 | phone3_X |
|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 |
| 2 | 4 | 6 | 8 | 10 | 12 |
I want to use the scale function on each pair phone1+phone1_X, Phone2+Phone2_X etc.. Each pair has the same name "phone1" but the second column always contains an additional "_X" (a different condition in the experiment).
In the end, I wish to have the original table but in Z.scores (but as I mentioned before, the mean is calculated by pair of columns and not by one column)
Thank you so much!
Solution 1:[1]
There might be a more elegant way, but this is how I'd do it.
library(dplyr)
library(tidyr)
df %>%
pivot_longer(cols = -phone) %>%
group_by(phone, name = stringr::str_extract(name, 'phone[0-9]?')) %>%
summarise(mean_value = mean(value), .groups = 'drop') %>%
pivot_wider(names_from = name, values_from = mean_value)
#> # A tibble: 2 × 4
#> phone phone1 phone2 phone3
#> <dbl> <dbl> <dbl> <dbl>
#> 1 1 2 3.5 5.5
#> 2 2 4 7 11
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Aron |
