How to create a time series using my own time data?
I want to create a time series that takes its time index from the time column of my data set and starts at the earliest year in that column.
I have a dataset that looks something like this:
Entity Year Rate
a 1900 x
a 1901 x
a 1902 x
b 1875 x
a 1876 x
a 1877 x
c 1980 x
c 1981 x
c 1982 x
c 1983 x

I have divided the dataset into subsets filtered by entity. I want to create a time series for entity a starting at the year 1900. All I know how to do is
tsA <- ts(subsetA, start = 1900, frequency = 1)
When creating the subset or the time series, is there a way to get R to recognize the "Year" column and use the years in that column as the time index for that entity?
Solution 1:[1]
If you want to work with time series, you could use the fable and tsibble packages. fable is the successor to the forecast package.
# attach tsibble explicitly so that as_tsibble() is available
library(fable)
library(tsibble)
library(ggplot2)
# index = the time column, key = the column that identifies each series
my_tsibble <- as_tsibble(df1, index = Year, key = Entity)
# plot with ggplot2, adding points on top of the lines
autoplot(my_tsibble) + geom_point()
data:
df1 <- structure(list(Entity = c("a", "a", "a", "b", "a", "a", "c",
"c", "c", "c"), Year = c(1900L, 1901L, 1902L, 1875L, 1876L, 1877L,
1980L, 1981L, 1982L, 1983L), Rate = c(0.336955619277433, 0.626354965148494,
0.540716192685068, 0.743173609254882, 0.290504944045097, 0.266880671493709,
0.770237174350768, 0.164355911314487, 0.753349485108629, 0.900830976199359
)), row.names = c(NA, -10L), class = "data.frame")
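To narrow the tsibble to the single entity and start year asked about in the question, it can be filtered like an ordinary data frame. The following is a minimal sketch, assuming the tsibble and dplyr packages are installed; the name `a_from_1900` and the Rate values are placeholders chosen for illustration.

```r
library(tsibble)

# Placeholder data with the same columns as df1; the Rate values are made up.
df1 <- data.frame(
  Entity = c("a", "a", "a", "b", "a", "a", "c", "c", "c", "c"),
  Year   = c(1900L, 1901L, 1902L, 1875L, 1876L, 1877L,
             1980L, 1981L, 1982L, 1983L),
  Rate   = (1:10) / 10
)

# Year becomes the time index, Entity identifies each series.
my_tsibble <- as_tsibble(df1, index = Year, key = Entity)

# Keep only entity "a" from 1900 onward; the result is still a tsibble.
a_from_1900 <- dplyr::filter(my_tsibble, Entity == "a", Year >= 1900)
a_from_1900
```

Because the index is stored with the object, there is no need to pass a start year separately, as there is with ts().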
Solution 2:[2]
Suppose we have the data frame shown reproducibly in the Note at the end. We can read it into a zoo object and, from there, convert it to various other forms. A ts object fills in missing years with NA, but a zoo object can represent irregularly spaced series, so it does not need to.
library(zoo)
# index = 2: the Year column is the time index; split = 1: one column per Entity
z <- read.zoo(DF, index = 2, split = 1)
z
# just part that starts at 1900
window(z, start = 1900)
# as a ts series
tt <- as.ts(z)
# as a wide data.frame
fortify.zoo(z)
# as a long data.frame
fortify.zoo(z, melt = TRUE)
# same but without NAs
na.omit(fortify.zoo(z, melt = TRUE))
# plot - omit facet=NULL to get separate panels
library(ggplot2)
autoplot(z, facet = NULL, geom = "point") + geom_line()
# plot lines only without points
autoplot(z, facet = NULL)
# using data frames
DF1900 <- subset(DF, Year >= 1900)
split(DF1900, DF1900$Entity)
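For the single-entity case in the question, the same idea can be sketched in base R without extra packages: take the start year from the Year column and let years with no observation become NA, as a ts object requires. This is a minimal sketch under stated assumptions; `entity_ts` is a hypothetical helper name, the Rate values are made up, and it assumes annual data with at most one row per entity and year.

```r
# Example data mirroring the question; the Rate values are placeholders.
DF <- data.frame(
  Entity = c("a", "a", "a", "b", "a", "a", "c", "c", "c", "c"),
  Year   = c(1900L, 1901L, 1902L, 1875L, 1876L, 1877L,
             1980L, 1981L, 1982L, 1983L),
  Rate   = 1:10
)

# Hypothetical helper: build an annual ts for one entity from its Year column.
# Assumes the entity has at least one row left after filtering.
entity_ts <- function(data, entity, from = NULL) {
  sub <- data[data$Entity == entity, ]
  if (!is.null(from)) sub <- sub[sub$Year >= from, ]
  # Every year between the first and last observation, in order.
  years <- seq(min(sub$Year), max(sub$Year))
  # Years without an observation get NA so the series stays regularly spaced.
  rate <- sub$Rate[match(years, sub$Year)]
  ts(rate, start = min(years), frequency = 1)
}

tsA     <- entity_ts(DF, "a", from = 1900)  # starts at 1900
tsA_all <- entity_ts(DF, "a")               # starts at 1876, NA for the gap
```

Here the start year comes from `min(sub$Year)` rather than being typed in by hand, so the same call works for any entity.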
Note
Lines <- "Entity Year Rate
a 1900 x
a 1901 x
a 1902 x
b 1875 x
a 1876 x
a 1877 x
c 1980 x
c 1981 x
c 1982 x
c 1983 x"
DF <- read.table(text = Lines, header = TRUE)
# replace the placeholder "x" values with numeric rates
DF$Rate <- seq_len(nrow(DF))
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | phiver |
| Solution 2 |

