'How to scale by pair of columns together

I want to use the scale function but to do it on each pair of columns - To calculate the mean on pair of columns and not on each column.

In details: This is my data for example:

phone phone1_X phone2 phone2_X phone3 phone3_X
1 2 3 4 5 6
2 4 6 8 10 12

I want to use the scale function on each pair phone1+phone1_X, Phone2+Phone2_X etc.. Each pair has the same name "phone1" but the second column always contains an additional "_X" (a different condition in the experiment).

In the end, I wish to have the original table but in Z.scores (but as I mentioned before, the mean is calculated by pair of columns and not by one column)

Thank you so much!

r


Solution 1:[1]

There might be a more elegant way, but this is how I'd do it.

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = -phone) %>%
  group_by(phone, name = stringr::str_extract(name, 'phone[0-9]?')) %>%
  summarise(mean_value = mean(value), .groups = 'drop') %>%
  pivot_wider(names_from = name, values_from = mean_value)

#> # A tibble: 2 × 4
#>   phone phone1 phone2 phone3
#>   <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1      2    3.5    5.5
#> 2     2      4    7     11

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Aron