Assigning new variable if a column takes specific values
I am trying to generate a new variable that identifies 'single parents' in a household, based on a group identifier. If a group contains a 'Child' but does not contain both a 'Head' and a 'Spouse', I would like the variable to take the value 1 for every row in that group, and 0 otherwise. I have tried using dplyr but have been unable to arrive at a solution.
relation<-c("Head","Spouse","Child","Head","Spouse","Head","Child")
group<-c(1,1,1,2,2,3,3)
my_data<-as.data.frame(cbind(group,relation))
my_data %>%
group_by(group) %>%
mutate(single_parent = case_when(relation %in% "Child" & !(relation %in% "Head" & relation %in% "Spouse")~1))
# desired output
my_data$single_parent<-c(0,0,0,0,0,1,1)
Thank you for your help.
Solution 1:[1]
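We could check, within each group, whether a 'Child' is present while 'Head' and 'Spouse' are not both present: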
library(dplyr)
my_data <- my_data %>%
  group_by(group) %>%
  # 1 when the group has a Child but not both a Head and a Spouse, else 0
  mutate(single_parent = +('Child' %in% relation &
    !all(c("Head", "Spouse") %in% relation))) %>%
  ungroup()
Output
my_data
# A tibble: 7 × 3
  group relation single_parent
  <dbl> <chr>            <int>
1     1 Head                 0
2     1 Spouse               0
3     1 Child                0
4     2 Head                 0
5     2 Spouse               0
6     3 Head                 1
7     3 Child                1
Data
my_data <- data.frame(group, relation)
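The per-group test used above can also be written out step by step in base R, which may make the logic easier to follow. This is only a sketch: `is_single_parent` and `flag_by_group` are hypothetical names introduced here for illustration, not part of the original answer.

```r
# Rebuild the example data from the question.
relation <- c("Head", "Spouse", "Child", "Head", "Spouse", "Head", "Child")
group <- c(1, 1, 1, 2, 2, 3, 3)
my_data <- data.frame(group, relation)

# Hypothetical helper: TRUE when a group's relations include a Child
# but do not include both a Head and a Spouse.
is_single_parent <- function(rel) {
  has_child <- "Child" %in% rel
  has_both_parents <- all(c("Head", "Spouse") %in% rel)
  has_child && !has_both_parents
}

# Evaluate the test once per group; the result is named by group value.
flag_by_group <- tapply(my_data$relation, my_data$group, is_single_parent)

# Map each row's group back to its flag and store it as 0/1.
my_data$single_parent <- as.integer(flag_by_group[as.character(my_data$group)])
```

Because the test looks at which roles are present rather than at the number of rows, a group such as Head, Child, Child would also be flagged.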
Solution 2:[2]
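Here is another tidyverse option. It flags a group when it has exactly two rows and one of them is a 'Child', so it assumes that a two-row group containing a 'Child' is a single-parent household: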
library(tidyverse)
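my_data %>%
  group_by(group) %>%
  # mark the Child row in two-member groups, leave every other row NA
  mutate(single_parent = ifelse(relation == "Child" & n() == 2, 1, NA)) %>%
  # spread the flag to the other rows of the same group
  fill(single_parent, .direction = "downup") %>%
  # groups that were never flagged get 0
  mutate(single_parent = replace_na(single_parent, 0))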
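Or another option that combines base R and the tidyverse, using table to count relations per group (the first column of the table is 'Child', because the columns are ordered alphabetically):
data.frame(
  group = unique(my_data$group),
  single_parent = +(table(my_data)[, 1] == 1 & rowSums(table(my_data)[, -1]) == 1)
) %>%
  left_join(my_data, ., by = "group")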
Output
  group relation single_parent
  <chr> <chr>            <dbl>
1 1     Head                 0
2 1     Spouse               0
3 1     Child                0
4 2     Head                 0
5 2     Spouse               0
6 3     Head                 1
7 3     Child                1
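The table-based option selects columns by position, which depends on the alphabetical order of the relation values. As a hedged variation, the sketch below selects the columns by name and uses presence rather than exact counts; `roles`, `counts`, `has_child`, `has_both` and `flag` are illustrative names assumed here, and the rule is a slightly different one from the answer's (any 'Child' without both a 'Head' and a 'Spouse').

```r
# Rebuild the example data from the question.
relation <- c("Head", "Spouse", "Child", "Head", "Spouse", "Head", "Child")
group <- c(1, 1, 1, 2, 2, 3, 3)
my_data <- data.frame(group, relation)

# Fix the factor levels so all three columns exist even if a role never occurs.
roles <- factor(my_data$relation, levels = c("Child", "Head", "Spouse"))

# One row per group, one column per role.
counts <- table(group = my_data$group, relation = roles)

# Select columns by name instead of by position.
has_child <- counts[, "Child"] > 0
has_both <- counts[, "Head"] > 0 & counts[, "Spouse"] > 0
flag <- has_child & !has_both

# flag is named by group value; look up each row's group by name.
my_data$single_parent <- as.integer(flag[as.character(my_data$group)])
```

Indexing by name avoids silently picking the wrong column if the set of relation values changes.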
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | akrun |
| Solution 2 |
