Is there a way to use for loops (or nested for loops) to use date information from one dataframe and find the corresponding date from another df?
This is a rather general question. I have a data frame of individuals with a date on each row. Using a second, daily data frame, I would like to find information for the consecutive days starting at each individual's date. For example, if individual X was born on 01-01-2000 (1st df), I would like a function that finds 01-01-2000 in the daily data frame (2nd df), takes the mean over the first 3 days from birth (namely 01-01-2000 : 03-01-2000), and adds it as a new column to the 1st df. It's not important what the mean is of; it could be weight, sunlight hours, or number of calls. The question may be a bit vague, so any help interpreting it would be appreciated.
name<-c("A","B","C","D")
dob<-c("01-01-2000","02-01-2000","03-01-2000","08-01-2000")
df1<-data.frame(name,dob)
  name        dob
1    A 01-01-2000
2    B 02-01-2000
3    C 03-01-2000
4    D 08-01-2000
date<- c("31-12-1999","01-01-2000","02-01-2000","03-01-2000","04-01-2000","05-01-2000","06-01-2000","07-01-2000","08-01-2000","09-01-2000","10-01-2000","11-01-2000")
calls<-c(0,0,1,2,2,2,0,0,1,4,2,3)
df2<-data.frame(date,calls)
         date calls
1  31-12-1999     0
2  01-01-2000     0
3  02-01-2000     1
4  03-01-2000     2
5  04-01-2000     2
6  05-01-2000     2
7  06-01-2000     0
8  07-01-2000     0
9  08-01-2000     1
10 09-01-2000     4
11 10-01-2000     2
12 11-01-2000     3
What I would like is the following:
  name        dob mean.call
1    A 01-01-2000      1.00
2    B 02-01-2000      1.67
3    C 03-01-2000      2.00
4    D 08-01-2000      2.33
As the data frames are rather large, I would like to implement for loops.
Solution 1:[1]
I would calculate the means first using zoo's rollmean function and then join df2 and df1:
library(dplyr)
library(zoo)
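df2 %>%
  # pad with two zero-call rows (their date is NA) so the last two days get a full window
  add_row(calls = rep(0, 2)) %>%
  # left-aligned 3-day mean: each row averages its own value and the next two
  mutate(means = rollmean(calls, k = 3, align = "left", fill = NA),
         .keep = "unused") %>%
  # keep only the dates that appear in df1
  right_join(df1, by = c("date" = "dob")) %>%
  select(name, date, means)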
This returns
  name       date    means
1    A 01-01-2000 1.000000
2    B 02-01-2000 1.666667
3    C 03-01-2000 2.000000
4    D 08-01-2000 2.333333
Note: I added two dummy rows (with zero calls) to df2 so that a mean can also be calculated for the last two entries. Since no rule was given for those values, I chose zeros. Keep in mind that the means for the last two dates therefore include these zeros.
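Since the question asks specifically about for loops, here is a minimal base R sketch of that approach, not part of the original answer. It assumes the dates are day-month-year strings as in the example, that the window is the birth date plus the following two days, and that days missing from df2 should simply be skipped (unlike the zero padding above). The names n_days, dob_dates, daily_dates, window and rows are made up for illustration.

```r
# Example data copied from the question
df1 <- data.frame(
  name = c("A", "B", "C", "D"),
  dob  = c("01-01-2000", "02-01-2000", "03-01-2000", "08-01-2000")
)
df2 <- data.frame(
  date  = c("31-12-1999", "01-01-2000", "02-01-2000", "03-01-2000",
            "04-01-2000", "05-01-2000", "06-01-2000", "07-01-2000",
            "08-01-2000", "09-01-2000", "10-01-2000", "11-01-2000"),
  calls = c(0, 0, 1, 2, 2, 2, 0, 0, 1, 4, 2, 3)
)

# Parse the strings into Date objects so that days can be added to them
dob_dates   <- as.Date(df1$dob,  format = "%d-%m-%Y")
daily_dates <- as.Date(df2$date, format = "%d-%m-%Y")

n_days <- 3                 # assumed window: the date itself plus the next two days
df1$mean.call <- NA_real_   # pre-allocate the result column

for (i in seq_len(nrow(df1))) {
  window <- dob_dates[i] + 0:(n_days - 1)   # the consecutive days to average
  rows   <- match(window, daily_dates)      # row positions in df2, NA if a day is absent
  # NA positions yield NA calls, which na.rm = TRUE drops from the mean
  df1$mean.call[i] <- mean(df2$calls[rows], na.rm = TRUE)
}

print(df1)
```

Because a window that runs past the end of df2 is averaged over the days that do exist, results near the end of the data can differ from the zero-padded rolling mean. For large data frames the rolling-mean-plus-join approach is likely to be faster, since it avoids a lookup per row.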
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Martin Gal |
