'Window function in R
I have table
DF <- tibble(
CLIENT = c(1,1,1,2,2,2,3,3,3),
N_DOG = c('N1','N2','N3','N4','N5','N6','N7','N8','N9'),
DT = as.Date(c('01.06.2021','12.07.2021','04.05.2021',
'03.08.2021','21.07.2021','04.01.2022',
'07.04.2021','09.11.2021','08.12.2021'), format = '%d.%m.%Y'),
RANK = c(1,1,2,3,1,1,2,2,3)
)
| CLIENT | N_DOG | DT | RANK |
|---|---|---|---|
| 1 | N1 | 2021-06-01 | 1 |
| 1 | N2 | 2021-07-12 | 1 |
| 1 | N3 | 2021-05-04 | 2 |
| 2 | N4 | 2021-08-03 | 3 |
| 2 | N5 | 2021-07-21 | 1 |
| 2 | N6 | 2022-01-04 | 1 |
| 3 | N7 | 2021-04-07 | 2 |
| 3 | N8 | 2021-11-09 | 2 |
| 3 | N9 | 2021-12-08 | 3 |
I want to find a contract for each client according to the following criteria:
- Find a contract with a minimum rank
- If there are several such contracts, we take the last one.
PS. The output must be a unique client and 1 contract
Realization :
MIN_RANK <- DF %>%
select(CLIENT, RANK) %>%
group_by(CLIENT) %>%
filter(RANK == min(RANK)) %>%
ungroup() %>%
distinct()
MIN_RANK %>%
inner_join(DF, by = c('CLIENT','RANK')) %>%
select(!c(N_DOG,RANK)) %>%
group_by(CLIENT) %>%
filter(DT == max(DT)) %>%
inner_join(DF, by = c('CLIENT','DT'))
Output
| CLIENT | DT | N_DOG | RANK |
|---|---|---|---|
| 1 | 2021-07-12 | N2 | 1 |
| 2 | 2022-01-04 | N6 | 1 |
| 3 | 2021-11-09 | N8 | 2 |
How can I do it faster ? [ maybe there's some function ? ] the same question at the ru forums.
English is not my native language, so please be kind to my mistakes.
Solution 1:[1]
data.table::setDT(DF)[order(RANK,-DT), .SD[1], by=CLIENT]
Output:
CLIENT N_DOG DT RANK
<num> <char> <Date> <num>
1: 2 N6 2022-01-04 1
2: 1 N2 2021-07-12 1
3: 3 N8 2021-11-09 2
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | langtang |
