New to R, working on my first capstone data project: I used difftime but no calculations are going into the new column
I'm trying to do my first data capstone project. I imported the CSV files and combined them into one data frame with no problems. I was able to remove a few columns using:
all_trips <- all_trips %>%
  select(-c(start_lat, start_lng, end_lat, end_lng))
I am using the past 12 months of data from divvy-tripdata: https://divvy-tripdata.s3.amazonaws.com/index.html
Edit: I was trying to separate the columns, but someone suggested using difftime, so I have now added this:
all_trips$ride_length <- as.difftime(all_trips$ended_at, all_trips$started_at, units = "mins")
It created the new column, but every value is NA. I suspect this is because the started_at and ended_at columns are of type chr. I haven't found a way to change the data type without losing the data. Still looking. Any help is appreciated.
I was asked to edit the question and add the output of dput(head(all_trips)):
dput(head(all_trips))
structure(list(ride_id = c("3564070EEFD12711", "0B820C7FCF22F489",
"89EEEE32293F07FF", "84D4751AEB31888D", "5664BCF0D1DE7A8B", "AA9EB7BD2E1FC128"
), rideable_type = c("electric_bike", "classic_bike", "classic_bike",
"classic_bike", "electric_bike", "classic_bike"), started_at = c("4/6/2022 17:42",
"4/24/2022 19:23", "4/20/2022 19:29", "4/22/2022 21:14", "4/16/2022 15:56",
"4/21/2022 16:52"), ended_at = c("4/6/2022 17:54", "4/24/2022 19:43",
"4/20/2022 19:35", "4/22/2022 21:23", "4/16/2022 16:02", "4/21/2022 16:56"
), start_station_name = c("Paulina St & Howard St", "Wentworth Ave & Cermak Rd",
"Halsted St & Polk St", "Wentworth Ave & Cermak Rd", "Halsted St & Polk St",
"Desplaines St & Randolph St"), start_station_id = c("515", "13075",
"TA1307000121", "13075", "TA1307000121", "15535"), end_station_name = c("University Library (NU)",
"Green St & Madison St", "Green St & Madison St", "Delano Ct & Roosevelt Rd",
"Clinton St & Madison St", "Canal St & Adams St"), end_station_id = c("605",
"TA1307000120", "TA1307000120", "KA1706005007", "TA1305000032",
"13011"), member_casual = c("member", "member", "member", "casual",
"member", "member"), ride_length = structure(c(NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_), class = "difftime", units = "mins")), row.names = c(NA,
6L), class = "data.frame")
Solution 1:[1]
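This looks like a problem with getting the started_at and ended_at columns into a date-time format. Both are stored as character strings, and as.difftime() treats its second argument as a format string rather than as a second timestamp, so nothing is parsed and every result is NA. Convert both columns with as.POSIXlt() first, then take the difference with difftime():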
all_trips$started_at <- as.POSIXlt(all_trips$started_at, format = "%m/%d/%Y %H:%M", tz="EST")
all_trips$ended_at <- as.POSIXlt(all_trips$ended_at, format = "%m/%d/%Y %H:%M", tz="EST")
all_trips$ride_length <- difftime(all_trips$ended_at, all_trips$started_at)
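As a variation on the same idea, here is a minimal sketch that uses as.POSIXct() (which tends to be easier to store in a data frame column than POSIXlt) and asks difftime() for minutes explicitly, since the question wanted units = "mins". The three sample rows are copied from the dput() output in the question. The time zone "America/Chicago" and the extra ride_minutes column are assumptions made for illustration, not part of the original answer.

```r
# Sample rows copied from the dput() output in the question
all_trips <- data.frame(
  started_at = c("4/6/2022 17:42", "4/24/2022 19:23", "4/20/2022 19:29"),
  ended_at   = c("4/6/2022 17:54", "4/24/2022 19:43", "4/20/2022 19:35"),
  stringsAsFactors = FALSE
)

# The format string must match the text exactly: month/day/year hour:minute.
# tz = "America/Chicago" is an assumption (the rides are in Chicago).
fmt <- "%m/%d/%Y %H:%M"
all_trips$started_at <- as.POSIXct(all_trips$started_at, format = fmt, tz = "America/Chicago")
all_trips$ended_at   <- as.POSIXct(all_trips$ended_at,   format = fmt, tz = "America/Chicago")

# Strings that do not match the format become NA, so count them before going on
n_failed <- sum(is.na(all_trips$started_at) | is.na(all_trips$ended_at))

# Ride length as a difftime in minutes
all_trips$ride_length <- difftime(all_trips$ended_at, all_trips$started_at, units = "mins")

# Hypothetical extra column: plain numeric minutes, handy for mean(), summary(), etc.
all_trips$ride_minutes <- as.numeric(all_trips$ride_length)
```

If n_failed is greater than zero on the full data set, some of the monthly files probably use a different timestamp layout, and those rows would need their own format string.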
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
