'Create dataset by filtering 4 hour intervals from datetime

I wish to create a dataset that only has the data from every observation 4 hours apart. Currently I possess a dataset that has hourly observations. Dividing does not work because the samples would not have 4 hour intervals between them.

IndividId   DateTimeLMT          isDaylight    YEAR  month   day  hour
  <chr>     <dttm>                  <chr>      <dbl> <dbl> <dbl> <dbl>
1 M_15_03   2017-01-13 15:00:42     True        2017     1    13    15
2 M_15_03   2017-01-13 16:00:14     False       2017     1    13    16
3 M_15_03   2017-01-13 17:00:09     False       2017     1    13    17
4 M_15_03   2017-01-13 18:00:42     False       2017     1    13    18
5 M_15_03   2017-01-13 19:00:14     False       2017     1    13    19
6 M_15_03   2017-01-13 20:00:45     False       2017     1    13    20 

The result I would be looking for is a formula that would create something similar to this:

IndividId   DateTimeLMT          isDaylight    YEAR  month   day  hour
  <chr>     <dttm>                  <chr>      <dbl> <dbl> <dbl> <dbl>
1 M_15_03   2017-01-13 15:00:42     True        2017     1    13    15
2 M_15_03   2017-01-13 19:00:14     False       2017     1    13    19
3 M_15_03   2017-01-13 23:00:09     False       2017     1    13    23
4 M_15_03   2017-01-14 03:00:42     False       2017     1    14    03
5 M_15_03   2017-01-14 07:00:14     False       2017     1    14    07
6 M_15_03   2017-01-14 11:00:45     True        2017     1    14    11  


Solution 1:[1]

You can use lubridate::floor_date. floor_date set to 4 hours will round at 16 - 20 - 24 hours, so you can use lead to make to 15 - 19, and fill with tidyr::fill.

library(lubridate)
library(tidyr)
library(dplyr)

data %>% 
  mutate(fl = floor_date(lead(DateTimeLMT), unit = "4 hours")) %>% 
  fill(fl) %>% 
  group_by(IndividId, fl) %>% 
  summarise(across(c(DateTimeLMT, YEAR:hour), first),
            isDaylight = any(isDaylight == "True"))

# A tibble: 2 x 8
# Groups:   IndividId [1]
  IndividId fl                  DateTimeLMT          YEAR month   day  hour isDaylight
  <chr>     <dttm>              <dttm>              <int> <int> <int> <int> <lgl>     
1 M_15_03   2017-01-13 16:00:00 2017-01-13 15:00:42  2017     1    13    15 TRUE      
2 M_15_03   2017-01-13 20:00:00 2017-01-13 19:00:14  2017     1    13    19 FALSE     

data

data <- read.table(header = T, text = "IndividId   DateTimeLMT          isDaylight    YEAR  month   day  hour
1 M_15_03   2017-01-13-15:00:42     True        2017     1    13    15
2 M_15_03   2017-01-13-16:00:14     False       2017     1    13    16
3 M_15_03   2017-01-13-17:00:09     False       2017     1    13    17
4 M_15_03   2017-01-13-18:00:42     False       2017     1    13    18
5 M_15_03   2017-01-13-19:00:14     False       2017     1    13    19
6 M_15_03   2017-01-13-20:00:45     False       2017     1    13    20 ")

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1