'Grouping data by quarter but adding a cut off date to account for zero values

I have the following dataset that are being grouped by quarter. The issue is that after Q3 2021 i.e. in Q4 2021 and Q1 2022 there hasn't been any cases so no dates in original data to look at and group them and put zero for them. Is there a way to put a cut off date when grouping data so it would account for such quarters? dput(test_date) is as follow:

structure(list(X = 1:31, test_quarter = c("2014 Q2", "2014 Q3", 
"2014 Q4", "2015 Q1", "2015 Q2", "2015 Q3", "2015 Q4", "2016 Q1", 
"2016 Q2", "2016 Q3", "2016 Q4", "2017 Q1", "2017 Q2", "2017 Q3", 
"2017 Q4", "2018 Q1", "2018 Q2", "2018 Q3", "2018 Q4", "2019 Q1", 
"2019 Q2", "2019 Q3", "2019 Q4", "2020 Q1", "2020 Q2", "2020 Q3", 
"2020 Q4", "2021 Q1", "2021 Q2", "2021 Q3", "2021 Q4"), cases = c(1L, 
0L, 0L, 0L, 0L, 0L, 38L, 4L, 2L, 8L, 9L, 13L, 6L, 20L, 32L, 42L, 
26L, 18L, 25L, 11L, 5L, 4L, 4L, 3L, 1L, 2L, 2L, 2L, 3L, 0L, 1L
)), class = "data.frame", row.names = c(NA, -31L))


library(dplyr)
dataset %>% 
  janitor::clean_names() %>% 
  mutate(test_date = lubridate::dmy(test_date),
         test_quarter = zoo::as.yearqtr(test_date)+0.25) %>% # 0.25 added to make the code put the end day of quarter instead of first day of quarter##
  count(test_quarter, name = 'cases') %>%
  tidyr::complete(test_quarter = seq(min(test_quarter), max(test_quarter), 1/4), 
                  fill = list(cases = 0))

This is how the data look-like after grouping. The only way I can think of is to ask the code to consider the end date as end of March 2022 while the recoded data are till July 2017:

enter image description here

any help is very much appreciated.



Solution 1:[1]

We assume that the question is how to sum the cases by year and quarter extending it by two quarters with zero cases.

Convert dataset to a zoo object z aggregating cases and then use merge.zoo to append two zero cases giving zoo object cases. Either use cases as is or else convert it back to data frame using fortify.zoo.

(If the input were the ending year and quarter to extend to rather than the number of quarters to extend by we could replace end(z) + 1:2/4 with seq(end(z) + 1/4, as.yearqtr("2022 Q2"), 1/4) .)

library(zoo)

z <- read.zoo(dataset[-1], FUN = as.yearqtr, aggregate = sum)
cases <- merge(z, zoo(, end(z) + 1:2/4), fill = 0)

fortify.zoo(cases, name = "yearqtr")

giving:

   yearqtr cases
1  2014 Q2     1
2  2014 Q3     0
3  2014 Q4     0
4  2015 Q1     0
5  2015 Q2     0
6  2015 Q3     0
7  2015 Q4    38
8  2016 Q1     4
9  2016 Q2     2
10 2016 Q3     8
11 2016 Q4     9
12 2017 Q1    13
13 2017 Q2     6
14 2017 Q3    20
15 2017 Q4    32
16 2018 Q1    42
17 2018 Q2    26
18 2018 Q3    18
19 2018 Q4    25
20 2019 Q1    11
21 2019 Q2     5
22 2019 Q3     4
23 2019 Q4     4
24 2020 Q1     3
25 2020 Q2     1
26 2020 Q3     2
27 2020 Q4     2
28 2021 Q1     2
29 2021 Q2     3
30 2021 Q3     0
31 2021 Q4     1
32 2022 Q1     0
33 2022 Q2     0

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1