How can I find the dates that events are happening concurrently in R by group?
In R, I need to find which treatments are occurring concurrently and work out what the total dose for each day would be. This has to be done per patient, so presumably with a group_by statement in dplyr.
| user_id | treatment | dosage | treatment_start | treatment_end |
|---|---|---|---|---|
| 1 | 1 | 3 | 01/28/2019 | 07/30/2019 |
| 1 | 1 | 2 | 05/26/2019 | 11/25/2019 |
| 1 | 2 | 1 | 08/13/2019 | 02/12/2020 |
| 1 | 1 | 2 | 12/06/2019 | 04/07/2020 |
| 1 | 2 | 1 | 12/09/2019 | 06/10/2020 |
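For a reproducible example, the table above could be entered as a data frame along these lines. The name `DF` matches the code used later in the question; storing the dates as `Date` class (parsed from mm/dd/yyyy strings) is an assumption, since the original column types aren't stated.

```r
# Sample data from the table above; dates are parsed from mm/dd/yyyy strings
DF <- data.frame(
  user_id   = c(1, 1, 1, 1, 1),
  treatment = c(1, 1, 2, 1, 2),
  dosage    = c(3, 2, 1, 2, 1),
  treatment_start = as.Date(
    c("01/28/2019", "05/26/2019", "08/13/2019", "12/06/2019", "12/09/2019"),
    format = "%m/%d/%Y"
  ),
  treatment_end = as.Date(
    c("07/30/2019", "11/25/2019", "02/12/2020", "04/07/2020", "06/10/2020"),
    format = "%m/%d/%Y"
  )
)
```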
Ideally, the final result would contain the user id, the treatments they're on, the summed dosage of all those treatments, and the date range over which they're on that combination of treatments. I've made an example results table with a few rows below.
| user_id | treatments | total_dosage | treatment_start | treatment_end |
|---|---|---|---|---|
| 1 | 1 | 3 | 01/28/2019 | 05/25/2019 |
| 1 | 1 | 5 | 05/26/2019 | 07/30/2019 |
| 1 | 1 | 2 | 07/31/2019 | 08/12/2019 |
| 1 | 1,2 | 3 | 08/13/2019 | 11/25/2019 |
I worked out how to find whether an event overlaps with other events, but it doesn't return the resulting dates and doesn't sum the dosages, so I don't know if it's usable. Here, `course` is a combination of the treatment and dosage columns.
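```r
library(dplyr)

# For each row, list the courses whose interval contains its start or end date
DF %>%
  group_by(user_id) %>%
  mutate(overlap = purrr::map2_chr(treatment_start, treatment_end,
    ~ toString(course[(.x >= treatment_start & .x < treatment_end) |
                      (.y > treatment_start & .y < treatment_end)]))) %>%
  ungroup()
```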
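One possible way to get both the dates and the summed dosages is to cut each patient's timeline at every start date and at every day following an end date, then sum the dosages of the treatments that cover each resulting segment. The following is only a sketch under stated assumptions, not an answer from the original thread: `split_concurrent`, `seg_start` and `seg_end` are hypothetical names, the date columns are assumed to be `Date` class, end dates are assumed to be inclusive, and `join_by()` requires dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical helper: one row per period with a constant set of treatments
split_concurrent <- function(df) {
  # Every start date, and every day after an end date, opens a new segment
  breaks <- bind_rows(
    transmute(df, user_id, seg_start = treatment_start),
    transmute(df, user_id, seg_start = treatment_end + 1)
  ) %>%
    distinct() %>%
    arrange(user_id, seg_start) %>%
    group_by(user_id) %>%
    mutate(seg_end = lead(seg_start) - 1) %>%  # segment ends the day before the next one starts
    ungroup() %>%
    filter(!is.na(seg_end))                    # the last breakpoint opens no segment

  # Keep the treatments that fully cover each segment, then aggregate.
  # Segments with no active treatment drop out of the inner join.
  breaks %>%
    inner_join(df, by = join_by(user_id,
                                seg_start >= treatment_start,
                                seg_end <= treatment_end)) %>%
    group_by(user_id, seg_start, seg_end) %>%
    summarise(
      treatments   = paste(sort(unique(treatment)), collapse = ","),
      total_dosage = sum(dosage),
      .groups = "drop"
    ) %>%
    select(user_id, treatments, total_dosage,
           treatment_start = seg_start, treatment_end = seg_end)
}
```

Calling `split_concurrent(DF)` on the sample data should return one row per period, ordered by date within each patient. Because segments are cut at every boundary, each treatment either covers a segment completely or not at all, which is why a simple containment test in the join is enough.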
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
