How to sum across rows with all NAs to be 0/NA
I have a dataframe:
dat <- data.frame(X1 = c(0, NA, NA),
X2 = c(1, NA, NA),
X3 = c(1, NA, NA),
Y1 = c(1, NA, NA),
Y2 = c(NA, NA, NA),
Y3 = c(0, NA, NA))
I want to create a composite score for X and Y variables. This is what I have so far:
library(dplyr)
clean_dat <- dat %>% rowwise() %>% mutate(X = sum(c(X1, X2, X3), na.rm = TRUE),
                                          Y = sum(c(Y1, Y2, Y3), na.rm = TRUE))
However, I want the composite score for the rows with all NAs (i.e. rows 2 and 3) to be 0 in columns X and Y. Does anyone know how to do this?
Edit: I'd like to know how I can make X and Y in rows 2 and 3 NA too.
Thanks so much!
Solution 1:[1]
By default, sum and rowSums return 0 when na.rm = TRUE and all of the elements are NA. To get NA instead, use an if/else or case_when approach: check with if_any whether a row has any non-NA element, then take the rowSums of the relevant columns inside case_when. Rows that match no condition get NA, which is the case_when default.
library(dplyr)
dat %>%
mutate(X = case_when(if_any(starts_with('X'), complete.cases)
~ rowSums(across(starts_with('X')), na.rm = TRUE)),
Y = case_when(if_any(starts_with('Y'), complete.cases) ~
rowSums(across(starts_with('Y')), na.rm = TRUE)) )
-output
  X1 X2 X3 Y1 Y2 Y3  X  Y
1  0  1  1  1 NA  0  2  1
2 NA NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA NA
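The if/else route mentioned above is not shown in the answer. A minimal sketch of it in base R, assuming a hypothetical helper named sum_or_na (it is not part of dplyr or base R), might look like this:

```r
# Same data as in the question
dat <- data.frame(X1 = c(0, NA, NA),
                  X2 = c(1, NA, NA),
                  X3 = c(1, NA, NA),
                  Y1 = c(1, NA, NA),
                  Y2 = c(NA, NA, NA),
                  Y3 = c(0, NA, NA))

# Hypothetical helper (name chosen for illustration only):
# sums each row while ignoring NAs, but returns NA for rows
# in which every value is NA
sum_or_na <- function(df) {
  all_na <- rowSums(!is.na(df)) == 0
  ifelse(all_na, NA_real_, rowSums(df, na.rm = TRUE))
}

dat$X <- sum_or_na(dat[c("X1", "X2", "X3")])
dat$Y <- sum_or_na(dat[c("Y1", "Y2", "Y3")])
```

For the 0 version asked about first, no special handling should be needed, since rowSums(df, na.rm = TRUE) on its own already returns 0 for an all-NA row.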
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
