How to flag time-varying indicators with overlapping dates in a longitudinal data set?

I have a simulated data set with 5 rows, each representing a block of person-time, each with its own start and end date ('start' and 'end').

  • Each row has a visit date associated with it ('visit'). A visit date is carried forward through the row that contains that date, and the next row takes the next visit date (e.g., '2015-09-11' repeats until the row that contains the next visit, '2015-09-17').
  • There is a 'mo_previsit' date, calculated as 'visit' minus 1 month.
  • There is a 'flag_mo' variable that marks the row in which the 'mo_previsit' date falls, and a 'flag_rows' variable that is meant to flag all rows that fall within that month before the visit.

Problem: 'flag_mo' and 'flag_rows' work for the first visit date ('2015-09-11'), but not for the second visit date ('2015-09-17'). Each row's 'mo_previsit' is compared only with that same row's 'start' and 'end', so the row that contains the second visit's 'mo_previsit' ('2015-08-17', which falls in row 3) is never matched, because row 3 carries the first visit date. Grouping the data differently does not seem to change this. How can I edit this code so that the search can overlap across visit dates when it creates 'flag_mo'?

#Load packages
pacman::p_load(dplyr, tidyr, lubridate)

#Create variables for data set 
start <- c('2015-01-01', '2015-04-04', '2015-08-13', '2015-09-11', '2015-09-17')
end <- c('2015-04-03', '2015-08-12', '2015-09-10', '2015-09-16', '2015-12-31')
visit <- c('2015-09-11', '2015-09-11', '2015-09-11', '2015-09-11', '2015-09-17')
row <- c(1, 2, 3, 4, 5)

#Populate data frame with variables
d <- data.frame(row)

#Format dates and add to data frame
d$start <- as.Date(start, format = '%Y-%m-%d')
d$end <- as.Date(end, format = '%Y-%m-%d')
d$visit <- as.Date(visit, format = '%Y-%m-%d')

d1 <- d %>%
  group_by(visit) %>%
  arrange(row) %>%
  #Calculate 'mo_previsit', which is the date that occurs 1 month before each visit date
  mutate(mo_previsit = visit %m-% months(1),
  #Create a flag to mark the row that contains the start of that month before each visit
         flag_mo = ifelse(((mo_previsit >= start) & (mo_previsit <= end)), 1, NA)) %>%
  group_by(visit, flag_mo) %>%
  arrange(visit) %>%
  #Create a new flag so that if the visit date is the same as the start date of a given row, 
  #we don't want to count that row as part of the 1 month that comes before the visit date
  mutate(flag_rows = ifelse(visit == start, 0, flag_mo)) %>%
  ungroup()

class(d1$mo_previsit) <- 'Date'
d1
#> # A tibble: 5 × 7
#>     row start      end        visit      mo_previsit flag_mo flag_rows
#>   <dbl> <date>     <date>     <date>     <date>        <dbl>     <dbl>
#> 1     1 2015-01-01 2015-04-03 2015-09-11 2015-08-11       NA        NA
#> 2     2 2015-04-04 2015-08-12 2015-09-11 2015-08-11        1         1
#> 3     3 2015-08-13 2015-09-10 2015-09-11 2015-08-11       NA        NA
#> 4     4 2015-09-11 2015-09-16 2015-09-11 2015-08-11       NA         0
#> 5     5 2015-09-17 2015-12-31 2015-09-17 2015-08-17       NA         0

Created on 2022-05-13 by the reprex package (v2.0.1)
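
One possible approach, offered as a sketch under assumptions rather than as the accepted solution: compute one look-back window per distinct visit, pair every person-time row with every window, and flag a row when its [start, end] interval overlaps [mo_previsit, visit). The names `windows` and `flags` are illustrative and not part of the original code. The result is in long format (one line per row and visit), and treating the window as open at 'visit' is an assumption drawn from the rule that a row starting on the visit date should not count.

```r
library(dplyr)
library(lubridate)

# Same sample data as in the question
d <- data.frame(
  row   = 1:5,
  start = as.Date(c('2015-01-01', '2015-04-04', '2015-08-13', '2015-09-11', '2015-09-17')),
  end   = as.Date(c('2015-04-03', '2015-08-12', '2015-09-10', '2015-09-16', '2015-12-31')),
  visit = as.Date(c('2015-09-11', '2015-09-11', '2015-09-11', '2015-09-11', '2015-09-17'))
)

# One look-back window per distinct visit date
windows <- d %>%
  distinct(visit) %>%
  mutate(mo_previsit = visit %m-% months(1))

# Cartesian product: every row is compared against every visit window,
# so a row can be matched to a visit other than the one it carries
flags <- merge(d[, c('row', 'start', 'end')], windows, by = NULL) %>%
  mutate(
    # Row that contains the start of the month before the visit
    flag_mo = as.integer(start <= mo_previsit & mo_previsit <= end),
    # Row overlaps [mo_previsit, visit); rows starting on the visit date are excluded (assumption)
    flag_rows = as.integer(start < visit & end >= mo_previsit)
  ) %>%
  arrange(visit, row)

flags
```

With the sample data, this should flag rows 2 and 3 for the '2015-09-11' visit and rows 3 and 4 for the '2015-09-17' visit. Row 3 belongs to both windows, which is why a single flag column per row cannot represent the overlap unless the long table is collapsed afterwards (for example, with any() per row).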



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
