'Value of one variable, where two other variables are equal in R
I am still quite new to R and I have a simple (I guess) question: I would like to divide the value in one column (variable Sum_Output) in the first year of a program by the value in the same column in the second year of a program.
I have identified the first and the second year. Now I fail to tell R
"Use the value of column 6 where Year = Firstyear and divide it by the value of column 6 where Year=Secondyear."
I tried the following approach, but get the error, that
longer object length is not a multiple of shorter object length.
So I guess, this operation does not achieve to only calculate the ratio for those rows, where the condition is fulfilled. How could I include this in my approach or do you have an easier way to calculate the ratio of sum_outputs in the first and the second year?
Thanks a lot for your time and ideas how to do this!
data <- structure(list(ID = c("Program A", "Program A", "Program A",
"Program A", "Program B", "Program B", "Program B", "Program B",
"Program C", "Program C", "Program C", "Program C", "Program D",
"Program D", "Program D", "Program D"), Year = c(2019, 2020,
2021, 2022, 2019, 2020, 2021, 2022, 2019, 2020, 2021, 2022, 2019,
2020, 2021, 2022), Input = c(0, 14, 10, 7, 0, 11, 20, 6, 0, 0,
3, 6, 0, 0, 0, 6), Firstyear = c(2020, 2020, 2020, 2020, 2020,
2020, 2020, 2020, 2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022
), Secondyear = c(2021, 2021, 2021, 2021, 2021, 2021,
2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022,
"NA"), Sum_Ouput = c(0, 50, 70, 80, 0, 70, 12000, 6, 0, 0, 40,
50, 0, 0, 0, 50)), row.names = c(NA, -16L), class = "data.frame")
Threshold <- 0.1
data$EF_Outlier <- ifelse(data[data$Year == data$Firstyear, "Sum_Ouput"]/data[data$Year == data$Secondyear, "Sum_Ouput"] < Threshold, 1, 0)
Solution 1:[1]
Maybe this is what the question really wants.
suppressPackageStartupMessages(library(dplyr))
divide <- function(x) x/dplyr::lead(x, default = NA_real_)
Threshold <- 0.1
data %>%
arrange(ID, Year) %>%
group_by(ID) %>%
mutate(EF_Outlier = ifelse(Year == Firstyear, divide(Sum_Ouput), Inf),
EF_Outlier = as.integer(abs(EF_Outlier) < Threshold))
#> # A tibble: 16 × 7
#> # Groups: ID [4]
#> ID Year Input Firstyear Secondyear Sum_Ouput EF_Outlier
#> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <int>
#> 1 Program A 2019 0 2020 2021 0 0
#> 2 Program A 2020 14 2020 2021 50 0
#> 3 Program A 2021 10 2020 2021 70 0
#> 4 Program A 2022 7 2020 2021 80 0
#> 5 Program B 2019 0 2020 2021 0 0
#> 6 Program B 2020 11 2020 2021 70 1
#> 7 Program B 2021 20 2020 2021 12000 0
#> 8 Program B 2022 6 2020 2021 6 0
#> 9 Program C 2019 0 2021 2022 0 0
#> 10 Program C 2020 0 2021 2022 0 0
#> 11 Program C 2021 3 2021 2022 40 0
#> 12 Program C 2022 6 2021 2022 50 0
#> 13 Program D 2019 0 2022 2022 0 0
#> 14 Program D 2020 0 2022 2022 0 0
#> 15 Program D 2021 0 2022 2022 0 0
#> 16 Program D 2022 6 2022 NA 50 NA
Created on 2022-05-23 by the reprex package (v2.0.1)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
