How to get proportion of specific value across columns?
I have a sample dataframe as below:
self race1 race2 race3 race4
1 1 2 2 1
2 1 1 1 1
3 1 3 1 1
4 2 1 3 1
I would like to get the proportion of 1s in the race columns as a new column. So for each row, I would count the number of 1s and divide it by 4. The desired output dataframe would look like below.
self race1 race2 race3 race4 prop_race_as1
1 1 2 2 1 2/4
2 1 1 1 1 4/4
3 1 3 1 1 3/4
4 2 1 3 1 2/4
How do I write a function that incorporates rowwise() to get the desired output?
Solution 1:[1]
Please find below two possibilities.
Reprex
1. With dplyr (and rowwise())
- Code
library(dplyr)
# For each row, count the race columns equal to 1 and divide by the 4 race columns
df %>%
  dplyr::rowwise() %>%
  dplyr::mutate(prop_race_as1 = sum(c_across(starts_with("race")) == 1) / 4)
- Output
#> # A tibble: 4 x 6
#> # Rowwise:
#>    self race1 race2 race3 race4 prop_race_as1
#>   <int> <int> <int> <int> <int>         <dbl>
#> 1     1     1     2     2     1          0.5
#> 2     2     1     1     1     1          1
#> 3     3     1     3     1     1          0.75
#> 4     4     2     1     3     1          0.5
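As a possible variation on the rowwise() approach, the hard-coded denominator can be avoided by taking the mean of a logical vector, which gives the proportion of TRUE values directly. This is only a sketch: it rebuilds the sample data so it can run on its own, and the name res is an assumption made for the illustration.

```r
library(dplyr)

# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  self  = 1:4,
  race1 = c(1L, 1L, 1L, 2L),
  race2 = c(2L, 1L, 3L, 1L),
  race3 = c(2L, 1L, 1L, 3L),
  race4 = c(1L, 1L, 1L, 1L)
)

# `res` is a name chosen for this sketch
res <- df %>%
  rowwise() %>%
  # mean() of a logical vector is the share of TRUE values,
  # so the number of race columns does not need to be written out
  mutate(prop_race_as1 = mean(c_across(starts_with("race")) == 1)) %>%
  ungroup()  # drop the rowwise grouping so later verbs behave as usual
```

The final ungroup() is optional, but without it the result stays a rowwise tibble and subsequent summaries would keep operating one row at a time.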
2. Using only base R
- Code
# Compare every race column to 1, count the matches per row, divide by 4
df$prop_race_as1 <- rowSums(df[startsWith(names(df), "race")] == 1) / 4
- Output
df
#>   self race1 race2 race3 race4 prop_race_as1
#> 1    1     1     2     2     1          0.50
#> 2    2     1     1     1     1          1.00
#> 3    3     1     3     1     1          0.75
#> 4    4     2     1     3     1          0.50
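The base R version can likewise be written without the fixed divisor by using rowMeans() on the logical matrix. The following is a minimal sketch under the assumption that every column whose name starts with "race" should be included; race_cols is a helper name introduced here.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  self  = 1:4,
  race1 = c(1L, 1L, 1L, 2L),
  race2 = c(2L, 1L, 3L, 1L),
  race3 = c(2L, 1L, 1L, 3L),
  race4 = c(1L, 1L, 1L, 1L)
)

# Logical index of the columns to inspect (`race_cols` is a name chosen here)
race_cols <- startsWith(names(df), "race")

# df[race_cols] == 1 is a logical matrix; rowMeans() returns the share of TRUE per row
df$prop_race_as1 <- rowMeans(df[race_cols] == 1)
```

If the race columns can contain missing values, rowMeans(..., na.rm = TRUE) would compute the proportion among the non-missing entries only, which may or may not be what is wanted.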
Data
df <- structure(list(self = 1:4, race1 = c(1L, 1L, 1L, 2L), race2 = c(2L,
1L, 3L, 1L), race3 = c(2L, 1L, 1L, 3L), race4 = c(1L, 1L, 1L,
1L)), class = "data.frame", row.names = c(NA, -4L))
Created on 2022-02-16 by the reprex package (v2.0.1)
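Since the question asks for a function, the rowwise() code could be wrapped roughly as follows. This is a sketch rather than part of the original answer: the function name add_prop_equal and its arguments prefix, value and new_col are hypothetical, and it assumes dplyr 1.0.0 or later (for c_across() and the "{name}" := syntax) and that the data has no columns named prefix or value.

```r
library(dplyr)

# Hypothetical helper: adds a column holding, for each row, the proportion of
# columns starting with `prefix` whose value equals `value`
add_prop_equal <- function(data, prefix = "race", value = 1,
                           new_col = "prop_race_as1") {
  data %>%
    rowwise() %>%
    mutate("{new_col}" := mean(c_across(starts_with(prefix)) == value)) %>%
    ungroup()
}

# Sample data from the question
df <- data.frame(
  self  = 1:4,
  race1 = c(1L, 1L, 1L, 2L),
  race2 = c(2L, 1L, 3L, 1L),
  race3 = c(2L, 1L, 1L, 3L),
  race4 = c(1L, 1L, 1L, 1L)
)

out <- add_prop_equal(df)
```

Calling add_prop_equal(df, value = 3, new_col = "prop_race_as3") would compute the share of 3s in the same way.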
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
