How to rank a variable in a column based on a conditional, when there are NAs in the column
I have a longitudinal data set with two people, in which the rows of data are numbered as 'episodes' and some episodes have a test 'result'. The goal of the code below is to:
1. Create a binary variable 'sup' to evaluate 'result'. If result is NA, then sup is NA. This code works.
2. Create sup_rank to enumerate the occurrences of sup == 1 within people who had an occurrence of sup == 1. In other words, I want to know whether this is the first time, second time, etc. that sup == 1. Problem: this code currently does not work, since person 2's first sup == 1 is ranked as 2 (when it should be ranked as 1).
3. Create an event variable that:
   - equals 1 if sup_rank == 1
   - equals 0 if sup == 0 OR sup_rank does not equal 1
   - equals NA if sup (and thus sup_rank) equals NA
I currently try to do step 3 in two steps, with event and event_final. Problem: it does not work because sup_rank does not work. Regardless, it would be ideal to create event as a single variable (without needing event_final).
# Load packages
pacman::p_load(dplyr)

# Create variables for data set
person <- c(1, 1, 2, 2, 2, 2, 2, 2, 2, 2)
episode <- c(1, 2, 1, 2, 3, 4, 5, 6, 7, 8)
result <- c(NA, NA, NA, 1, NA, 2, NA, 2, NA, 2)

# Populate data frame with variables
d <- cbind(person, episode, result)
d <- as.data.frame(d)

# Manipulate data frame to create 4 new variables
d1 <- d %>%
  # Need to create new variables within each person
  group_by(person) %>%
  # Need to correctly order the rows of data before creating the variables
  arrange(person, episode) %>%
  # Create variable to evaluate 'result'
  mutate(sup = if_else(result == 2, 1, 0, NA_real_)) %>%
  # If sup == 1, rank it
  mutate(sup_rank = ifelse(sup == 1, rank(sup == 1, na.last = 'keep', ties.method = 'first'), NA_real_)) %>%
  # Create an event if the rank of the sup == 1 is equal to 1 (we want the initial suppression)
  mutate(event = if_else(sup_rank == 1, 1, 0, NA_real_)) %>%
  # Now override the value of event to be equal to 0 if sup == 0
  mutate(event_final = if_else(sup == 0, 0, event)) %>%
  arrange(person, episode)

print(d1)
#> # A tibble: 10 x 7
#> # Groups:   person [2]
#>    person episode result   sup sup_rank event event_final
#>     <dbl>   <dbl>  <dbl> <dbl>    <dbl> <dbl>       <dbl>
#>  1      1       1     NA    NA       NA    NA          NA
#>  2      1       2     NA    NA       NA    NA          NA
#>  3      2       1     NA    NA       NA    NA          NA
#>  4      2       2      1     0       NA    NA           0
#>  5      2       3     NA    NA       NA    NA          NA
#>  6      2       4      2     1        2     0           0
#>  7      2       5     NA    NA       NA    NA          NA
#>  8      2       6      2     1        3     0           0
#>  9      2       7     NA    NA       NA    NA          NA
#> 10      2       8      2     1        4     0           0
Created on 2022-04-20 by the reprex package (v2.0.0)
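The unexpected rank of 2 can be reproduced outside the pipeline. The following is a minimal sketch that uses person 2's sup values copied from the output above; the names `sup_p2` and `ranks` are introduced here for illustration only and do not appear in the original code.

```r
# sup for person 2, copied from the printed tibble (illustrative name)
sup_p2 <- c(NA, 0, NA, 1, NA, 1, NA, 1)

# rank() sees the whole logical vector sup_p2 == 1, including the FALSE
# produced by the sup == 0 row. FALSE sorts before TRUE, so it takes rank 1
# and the TRUE values start at rank 2.
ranks <- rank(sup_p2 == 1, na.last = "keep", ties.method = "first")
print(ranks)
#> [1] NA  1 NA  2 NA  3 NA  4
```

The surrounding ifelse(sup == 1, ..., NA_real_) then blanks out the rank-1 row because its sup is 0, which leaves 2, 3 and 4 for the sup == 1 rows.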
Solution 1:[1]
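There is surely a more efficient way to do this, but in the meantime, here's a solution I created. The extra sup_flag column is 1 where sup == 1 and NA everywhere else, so rank() skips the sup == 0 rows and numbers only the sup == 1 rows within each person: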
# Load packages
pacman::p_load(dplyr)

# Create variables for data set
person <- c(1, 1, 2, 2, 2, 2, 2, 2, 2, 2)
episode <- c(1, 2, 1, 2, 3, 4, 5, 6, 7, 8)
result <- c(NA, NA, NA, 1, NA, 2, NA, 2, NA, 2)

# Populate data frame with variables
d <- cbind(person, episode, result)
d <- as.data.frame(d)

# Manipulate data frame to create 5 new variables
d1 <- d %>%
  # Need to create new variables within each person
  group_by(person) %>%
  # Need to correctly order the rows of data before creating the variables
  arrange(person, episode) %>%
  # Create variable to evaluate 'result'
  mutate(sup = if_else(result == 2, 1, 0, NA_real_)) %>%
  # Create a flag for each time sup == 1
  mutate(sup_flag = if_else(sup == 1, 1, NA_real_, NA_real_)) %>%
  # If sup == 1, rank it
  mutate(sup_rank = ifelse(sup == 1, rank(sup_flag, na.last = 'keep', ties.method = 'first'), NA_real_)) %>%
  # Create an event if the rank of the sup == 1 is equal to 1 (we want the initial suppression)
  mutate(event = if_else(sup_rank == 1, 1, 0, NA_real_)) %>%
  # Now override the value of event to be equal to 0 if sup == 0
  mutate(event_final = if_else(sup == 0, 0, event)) %>%
  arrange(person, episode)

print(d1)
#> # A tibble: 10 x 8
#> # Groups:   person [2]
#>    person episode result   sup sup_flag sup_rank event event_final
#>     <dbl>   <dbl>  <dbl> <dbl>    <dbl>    <dbl> <dbl>       <dbl>
#>  1      1       1     NA    NA       NA       NA    NA          NA
#>  2      1       2     NA    NA       NA       NA    NA          NA
#>  3      2       1     NA    NA       NA       NA    NA          NA
#>  4      2       2      1     0       NA       NA    NA           0
#>  5      2       3     NA    NA       NA       NA    NA          NA
#>  6      2       4      2     1        1        1     1           1
#>  7      2       5     NA    NA       NA       NA    NA          NA
#>  8      2       6      2     1        1        2     0           0
#>  9      2       7     NA    NA       NA       NA    NA          NA
#> 10      2       8      2     1        1        3     0           0
Created on 2022-04-22 by the reprex package (v2.0.0)
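Since the question asks for event as a single variable, one possible alternative is to count occurrences with cumsum() and build event with case_when(). This is only a sketch under the same sample data, not the method used in the answer above; the name `d2` is introduced here for illustration, and library(dplyr) is used in place of pacman::p_load(dplyr) to keep the example self-contained.

```r
library(dplyr)

# Same sample data as in the question
d <- data.frame(
  person  = c(1, 1, 2, 2, 2, 2, 2, 2, 2, 2),
  episode = c(1, 2, 1, 2, 3, 4, 5, 6, 7, 8),
  result  = c(NA, NA, NA, 1, NA, 2, NA, 2, NA, 2)
)

d2 <- d %>%
  arrange(person, episode) %>%
  group_by(person) %>%
  mutate(
    # 1 if result is 2, 0 for any other result, NA if result is NA
    sup = if_else(result == 2, 1, 0),
    # Running count of sup == 1 within each person; NA wherever sup is not 1.
    # coalesce() treats NA as 0 so the running sum is not interrupted.
    sup_rank = if_else(sup == 1, cumsum(coalesce(sup, 0)), NA_real_),
    # NA if sup is NA, 1 for the first sup == 1, 0 otherwise
    event = case_when(
      is.na(sup)    ~ NA_real_,
      sup_rank == 1 ~ 1,
      TRUE          ~ 0
    )
  ) %>%
  ungroup()

print(d2)
```

For the sample data, event comes out as NA, NA for person 1 and NA, 0, NA, 1, NA, 0, NA, 0 for person 2, which matches event_final in the output above without needing sup_flag or a second event column.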
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | ryorlets |
