Expand an hourly time series into seconds in R
I have a data frame with a timestamp and value on the hour. I would like to expand the table so that there is a new row for every second within the hour.
This is the data frame that I currently have:
time value
2020-01-01 00:00:00 1.5
2020-01-01 01:00:00 5.2
2020-01-01 02:00:00 1.9
2020-01-01 03:00:00 2.2
2020-01-01 04:00:00 3.3
This is what I would like to end up with (abbreviated for simplicity):
time value
2020-01-01 00:00:00 1.5
2020-01-01 00:00:01 1.5
2020-01-01 00:00:02 1.5
2020-01-01 00:00:03 1.5
...
2020-01-01 00:59:58 1.5
2020-01-01 00:59:59 1.5
2020-01-01 01:00:00 5.2
2020-01-01 01:00:01 5.2
2020-01-01 01:00:02 5.2
...
2020-01-01 01:59:58 5.2
2020-01-01 01:59:59 5.2
2020-01-01 02:00:00 1.9
2020-01-01 02:00:01 1.9
2020-01-01 02:00:02 1.9
...
2020-01-01 02:59:58 1.9
2020-01-01 02:59:59 1.9
2020-01-01 03:00:00 2.2
2020-01-01 03:00:01 2.2
2020-01-01 03:00:02 2.2
...
2020-01-01 03:59:58 2.2
2020-01-01 03:59:59 2.2
2020-01-01 04:00:00 3.3
2020-01-01 04:00:01 3.3
2020-01-01 04:00:02 3.3
...
2020-01-01 04:59:58 3.3
2020-01-01 04:59:59 3.3
Any suggestions are greatly appreciated!
Solution 1:[1]
You can use complete and fill:
library(tidyr)
df %>%
  complete(date = seq(from = min(date), to = max(date), by = 1)) %>%
  fill(value)
# A tibble: 7,201 x 2
   date                value
   <dttm>              <dbl>
 1 2020-01-01 00:00:00   1.5
 2 2020-01-01 00:00:01   1.5
 3 2020-01-01 00:00:02   1.5
 4 2020-01-01 00:00:03   1.5
 5 2020-01-01 00:00:04   1.5
 6 2020-01-01 00:00:05   1.5
 7 2020-01-01 00:00:06   1.5
 8 2020-01-01 00:00:07   1.5
 9 2020-01-01 00:00:08   1.5
10 2020-01-01 00:00:09   1.5
# ... with 7,191 more rows
Data:
date = as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 01:00:00", "2020-01-01 02:00:00"))
value = c(1.5, 5.2, 6)
df <- data.frame(date = date, value = value)
To also expand the final hour in the data frame, extend the end of the sequence by one hour minus one second (credit to @caldwellst):
library(tidyr)
library(lubridate)
df %>%
  complete(date = seq(from = min(date), to = max(date) + hours(1) - seconds(1), by = 1)) %>%
  fill(value)
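For comparison, the same expansion can be sketched in base R without tidyr. This is only a rough alternative, not part of the original answer, and it assumes every row sits exactly on the hour with no missing hours. The names `hourly`, `expanded` and `secs_per_hour` are made up for illustration, and the time zone is set to UTC to keep daylight-saving shifts out of the picture.

```r
# Illustrative data with the same shape as the example above
hourly <- data.frame(
  date  = as.POSIXct(c("2020-01-01 00:00:00",
                       "2020-01-01 01:00:00",
                       "2020-01-01 02:00:00"), tz = "UTC"),
  value = c(1.5, 5.2, 6)
)

secs_per_hour <- 3600

# Repeat each hourly row 3600 times, then add an offset of 0..3599 seconds
# so that every hour (including the last one) is filled second by second.
expanded <- data.frame(
  date  = rep(hourly$date, each = secs_per_hour) +
          rep(0:(secs_per_hour - 1), times = nrow(hourly)),
  value = rep(hourly$value, each = secs_per_hour)
)

head(expanded)
tail(expanded)
```

Because each row is expanded on its own, no fill step is needed here. If hours can be missing from the input, the complete and fill approach is probably the safer choice.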
Solution 2:[2]
The following sequence of steps should do it:
data <- structure(
  list(
    time = c("01/01/2020 00:00:00", "01/01/2020 01:00:00", "01/01/2020 02:00:00",
             "01/01/2020 03:00:00", "01/01/2020 04:00:00"),
    value = c(1.5, 5.2, 1.9, 2.2, 3.3)
  ),
  row.names = c(NA, -5L),
  class = "data.frame"
)
library(tidyverse)
data %>%
  mutate(time = lubridate::dmy_hms(time)) %>%
  complete(time = seq.POSIXt(min(time), max(time), by = "sec")) %>%
  fill(value)
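Note that a sequence ending at max(time) stops at 04:00:00, while the desired output in the question runs to 04:59:59. The short base R check below is only a sketch of the row counts to expect in each case. The variable names are hypothetical, and UTC is assumed so that no daylight-saving change affects the count.

```r
# The five hourly timestamps from the question, parsed as UTC
time <- as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 01:00:00",
                     "2020-01-01 02:00:00", "2020-01-01 03:00:00",
                     "2020-01-01 04:00:00"), tz = "UTC")

# Grid that stops at the last timestamp: (5 - 1) * 3600 + 1 rows
grid_to_last_stamp <- seq(min(time), max(time), by = "sec")

# Grid that also covers the final hour: 5 * 3600 rows
grid_full_hours <- seq(min(time), max(time) + 3599, by = "sec")

length(grid_to_last_stamp)  # 14401
length(grid_full_hours)     # 18000
range(grid_full_hours)
```

Passing max(time) + 3599 as the upper bound inside complete would then give the full final hour, in the same spirit as the hours(1) - seconds(1) adjustment in Solution 1.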
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Quixotic22 |