'Value of one variable, where two other variables are equal in R

I am still quite new to R and I have a simple (I guess) question: I would like to divide the value in one column (variable Sum_Output) in the first year of a program by the value in the same column in the second year of a program.
I have identified the first and the second year. Now I fail to tell R

"Use the value of column 6 where Year = Firstyear and divide it by the value of column 6 where Year=Secondyear."

I tried the following approach, but get the error, that

longer object length is not a multiple of shorter object length.

So I guess, this operation does not achieve to only calculate the ratio for those rows, where the condition is fulfilled. How could I include this in my approach or do you have an easier way to calculate the ratio of sum_outputs in the first and the second year?

Thanks a lot for your time and ideas how to do this!

data <- structure(list(ID = c("Program A", "Program A", "Program A", 
                              "Program A", "Program B", "Program B", "Program B", "Program B", 
                              "Program C", "Program C", "Program C", "Program C", "Program D", 
                              "Program D", "Program D", "Program D"), Year = c(2019, 2020, 
                                                                                  2021, 2022, 2019, 2020, 2021, 2022, 2019, 2020, 2021, 2022, 2019, 
                                                                                  2020, 2021, 2022), Input = c(0, 14, 10, 7, 0, 11, 20, 6, 0, 0, 
                                                                                                               3, 6, 0, 0, 0, 6), Firstyear = c(2020, 2020, 2020, 2020, 2020, 
                                                                                                                                                2020, 2020, 2020, 2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022
                                                                                                               ), Secondyear = c(2021, 2021, 2021, 2021, 2021, 2021, 
                                                                                                                                 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 
                                                                                                                                 "NA"), Sum_Ouput = c(0, 50, 70, 80, 0, 70, 12000, 6, 0, 0, 40, 
                                                                                                                                                      50, 0, 0, 0, 50)), row.names = c(NA, -16L), class = "data.frame")

Threshold <- 0.1
data$EF_Outlier <- ifelse(data[data$Year == data$Firstyear, "Sum_Ouput"]/data[data$Year == data$Secondyear, "Sum_Ouput"] < Threshold,  1, 0)


Solution 1:[1]

Maybe this is what the question really wants.

suppressPackageStartupMessages(library(dplyr))

divide <- function(x) x/dplyr::lead(x, default = NA_real_)

Threshold <- 0.1

data %>%
  arrange(ID, Year) %>%
  group_by(ID) %>%
  mutate(EF_Outlier = ifelse(Year == Firstyear, divide(Sum_Ouput), Inf),
         EF_Outlier = as.integer(abs(EF_Outlier) < Threshold))
#> # A tibble: 16 × 7
#> # Groups:   ID [4]
#>    ID         Year Input Firstyear Secondyear Sum_Ouput EF_Outlier
#>    <chr>     <dbl> <dbl>     <dbl> <chr>          <dbl>      <int>
#>  1 Program A  2019     0      2020 2021               0          0
#>  2 Program A  2020    14      2020 2021              50          0
#>  3 Program A  2021    10      2020 2021              70          0
#>  4 Program A  2022     7      2020 2021              80          0
#>  5 Program B  2019     0      2020 2021               0          0
#>  6 Program B  2020    11      2020 2021              70          1
#>  7 Program B  2021    20      2020 2021           12000          0
#>  8 Program B  2022     6      2020 2021               6          0
#>  9 Program C  2019     0      2021 2022               0          0
#> 10 Program C  2020     0      2021 2022               0          0
#> 11 Program C  2021     3      2021 2022              40          0
#> 12 Program C  2022     6      2021 2022              50          0
#> 13 Program D  2019     0      2022 2022               0          0
#> 14 Program D  2020     0      2022 2022               0          0
#> 15 Program D  2021     0      2022 2022               0          0
#> 16 Program D  2022     6      2022 NA                50         NA

Created on 2022-05-23 by the reprex package (v2.0.1)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1