How to iteratively change the values of a column in R without a loop?
I would like to iteratively update the values of a column (value2 in the example). value2 at time i depends on value1 and on the updated value2 at times i and i-1.
Time values are stored in ascending order.
The processing is done separately for each value of the group column.
But as shown in my example, I can't manage to update value2 with accumulate2 (purrr package).
Could someone give me some advice on how to do this?
Thank you.
library(dplyr)
library(purrr)

input <- data.frame(group  = c(1, 1, 1, 2, 2, 2, 2),
                    time   = c(1, 2, 3, 1, 2, 3, 4),
                    value1 = c(4, 2, 2, 3, 3, 3, 3),
                    value2 = c(4, 2, 1, 3, 3, 1, 1))
input <- arrange(input, group, time)

my_function <- function(df) {
  df %>%
    as_tibble() %>%
    group_by(group) %>%
    mutate(value2 = purrr::accumulate2(
      .x = value2,
      # condition for rows 2..n of each group (first element dropped)
      .y = ((value1 == lag(value1))
            & (lag(value2) == value1)
            & (value1 != value2))[-1],
      .f = function(.i_1, .i, .y) {
        if (.y) {.i_1} else {.i}
      }) %>% unlist())
}
> input
  group time value1 value2
1     1    1      4      4
2     1    2      2      2
3     1    3      2      1
4     2    1      3      3
5     2    2      3      3
6     2    3      3      1
7     2    4      3      1
output <- my_function(input)
> output
  group time value1 value2
1     1    1      4      4
2     1    2      2      2
3     1    3      2      2 -> data change (OK)
4     2    1      3      3
5     2    2      3      3
6     2    3      3      3 -> data change (OK)
7     2    4      3      1 -> no data change / should be replaced by 3
Solution 1:[1]
It seems that your problem lies in your algorithm. Unfortunately, as you didn't explain it here, we cannot help you with that part.
purrr::accumulate2 can be hard to use, so I advise you to split your code as much as possible. This will make your code much more readable, and will make debugging and finding errors much easier.
For instance, consider this:
library(tidyverse)

input <- data.frame(group  = c(1, 1, 1, 2, 2, 2, 2),
                    time   = c(1, 2, 3, 1, 2, 3, 4),
                    value1 = c(4, 2, 2, 3, 3, 3, 3),
                    value2 = c(4, 2, 1, 3, 3, 1, 1))
input <- arrange(input, group, time)

# Document your functions
#' @param .i_1 this is ...
#' @param .i this is ...
#' @param .y this is ...
my_accu_function = function(.i_1, .i, .y) {
  if (.y) {.i_1} else {.i}
}

my_function <- function(df) {
  df %>%
    as_tibble() %>%
    group_by(group) %>%
    mutate(
      # keep the condition in its own column so it can be inspected
      cond = value1 == lag(value1) &
        lag(value2) == value1 &
        value1 != value2,
      value2_update = purrr::accumulate2(.x = value2,
                                         .y = cond[-1],
                                         .f = my_accu_function) %>% unlist()
    )
}
input
#>   group time value1 value2
#> 1     1    1      4      4
#> 2     1    2      2      2
#> 3     1    3      2      1
#> 4     2    1      3      3
#> 5     2    2      3      3
#> 6     2    3      3      1
#> 7     2    4      3      1
output = my_function(input)
output
#> # A tibble: 7 x 6
#> # Groups:   group [2]
#>   group  time value1 value2 cond  value2_update
#>   <dbl> <dbl>  <dbl>  <dbl> <lgl>         <dbl>
#> 1     1     1      4      4 FALSE             4
#> 2     1     2      2      2 FALSE             2
#> 3     1     3      2      1 TRUE              2
#> 4     2     1      3      3 FALSE             3
#> 5     2     2      3      3 FALSE             3
#> 6     2     3      3      1 TRUE              3
#> 7     2     4      3      1 FALSE             1
stopifnot(output$value2_update[7]==3)
#> Error: output$value2_update[7] == 3 is not TRUE
Created on 2022-05-11 by the reprex package (v2.0.1)
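You can see that cond is FALSE in the last row, so accumulate2 did its job: it kept the current value 1 rather than the previous value 3. cond is FALSE there because lag(value2) is taken from the original value2 column (1 in row 6), not from the updated value (3), so lag(value2)==value1 does not hold.
If you explain your algorithm, maybe we can help you set the proper condition cond so that you get the expected output.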
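In the meantime, here is a minimal sketch of one possible reading of the question. It assumes that the intended rule compares value1 with the already updated value2 of the previous row (the question mentions "updated value2 at time i and i-1"), which the asker has not confirmed. The helper name carry_forward is hypothetical, and the sketch assumes there are no missing values in value1 or value2. The condition is evaluated inside the accumulating function, so it sees the running result instead of the original column:

```r
library(dplyr)
library(purrr)

input <- data.frame(group  = c(1, 1, 1, 2, 2, 2, 2),
                    time   = c(1, 2, 3, 1, 2, 3, 4),
                    value1 = c(4, 2, 2, 3, 3, 3, 3),
                    value2 = c(4, 2, 1, 3, 3, 1, 1))

# Hypothetical helper (not from the original post): walks over the rows of
# one group and carries the previous *updated* value2 forward when the
# assumed condition holds. Assumes no NA in value1 or value2.
carry_forward <- function(value1, value2) {
  steps <- accumulate(
    seq_along(value2)[-1],          # row indices 2..n of the group
    function(prev, i) {
      # prev is the updated value2 of row i - 1
      keep_prev <- value1[i] == value1[i - 1] &&
        prev == value1[i] &&
        value1[i] != value2[i]
      if (keep_prev) prev else value2[i]
    },
    .init = value2[1]               # the first row is kept unchanged
  )
  unlist(steps)
}

output <- input %>%
  arrange(group, time) %>%
  group_by(group) %>%
  mutate(value2_update = carry_forward(value1, value2)) %>%
  ungroup()

output$value2_update
```

With the example data, row 7 is compared against the updated value 3 of row 6 rather than the original value 1. If the real rule is different, only the keep_prev expression would need to change.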
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Dan Chaltiel |
