New to R, working on my first capstone data project: I used difftime, but no calculations are going to the new column

I'm trying to do my first data capstone project. I imported the CSV files and combined them into one data frame with no problems. I was able to remove a few columns by using:

all_trips <- all_trips %>%
  select(-c(start_lat, start_lng, end_lat, end_lng))

I am using the past 12 months of data from divvy-tripdata (https://divvy-tripdata.s3.amazonaws.com/index.html). Edit: I was trying to separate the columns, but someone suggested using difftime, so I have now added this:

all_trips$ride_length <- as.difftime(all_trips$ended_at, all_trips$started_at, units = "mins")

It created the new column, but every value is NA. I suspect it is because started_at and ended_at are of type chr. I haven't found a way to change the data type without losing the data. Still looking. Any help is appreciated.

I was asked to edit and add the output of dput(head(all_trips)):

dput(head(all_trips))
structure(list(ride_id = c("3564070EEFD12711", "0B820C7FCF22F489", 
"89EEEE32293F07FF", "84D4751AEB31888D", "5664BCF0D1DE7A8B", "AA9EB7BD2E1FC128"
), rideable_type = c("electric_bike", "classic_bike", "classic_bike", 
"classic_bike", "electric_bike", "classic_bike"), started_at = c("4/6/2022 17:42", 
"4/24/2022 19:23", "4/20/2022 19:29", "4/22/2022 21:14", "4/16/2022 15:56", 
"4/21/2022 16:52"), ended_at = c("4/6/2022 17:54", "4/24/2022 19:43", 
"4/20/2022 19:35", "4/22/2022 21:23", "4/16/2022 16:02", "4/21/2022 16:56"
), start_station_name = c("Paulina St & Howard St", "Wentworth Ave & Cermak Rd", 
"Halsted St & Polk St", "Wentworth Ave & Cermak Rd", "Halsted St & Polk St", 
"Desplaines St & Randolph St"), start_station_id = c("515", "13075", 
"TA1307000121", "13075", "TA1307000121", "15535"), end_station_name = c("University Library (NU)", 
"Green St & Madison St", "Green St & Madison St", "Delano Ct & Roosevelt Rd", 
"Clinton St & Madison St", "Canal St & Adams St"), end_station_id = c("605", 
"TA1307000120", "TA1307000120", "KA1706005007", "TA1305000032", 
"13011"), member_casual = c("member", "member", "member", "casual", 
"member", "member"), ride_length = structure(c(NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_), class = "difftime", units = "mins")), row.names = c(NA, 
6L), class = "data.frame")


Solution 1:[1]

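There are two issues here. First, started_at and ended_at are character strings and need to be parsed into date-times; as.POSIXlt() can do that. Second, as.difftime() converts a single vector of durations and treats its second argument as a format string, so it cannot subtract one column from another; difftime() is the function that does. Putting both together: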

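# Parse the character columns into date-times (month/day/year hour:minute)
all_trips$started_at <- as.POSIXlt(all_trips$started_at, format = "%m/%d/%Y %H:%M", tz="EST")
all_trips$ended_at <- as.POSIXlt(all_trips$ended_at, format = "%m/%d/%Y %H:%M", tz="EST")
# difftime() subtracts the two columns; units = "mins" reports the result in minutes
all_trips$ride_length <- difftime(all_trips$ended_at, all_trips$started_at, units = "mins")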
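
To see the whole thing end to end, here is a minimal self-contained sketch built from three of the rows in the question's dput() output. It is only an illustration under a few assumptions: the small data frame is called trips, the time zone is set to "UTC" purely for the example, and the extra column ride_length_mins is a made-up name. It uses as.POSIXct() instead of as.POSIXlt(), since POSIXct stores each time as a single number and tends to behave more smoothly as a data frame column.

```r
# Sample rows copied from the dput() output in the question.
trips <- data.frame(
  started_at = c("4/6/2022 17:42", "4/24/2022 19:23", "4/20/2022 19:29"),
  ended_at   = c("4/6/2022 17:54", "4/24/2022 19:43", "4/20/2022 19:35"),
  stringsAsFactors = FALSE
)

# Reproduce the problem: the second argument of as.difftime() is a format
# string, so started_at is used as a format and nothing parses.
bad <- as.difftime(trips$ended_at, trips$started_at, units = "mins")

# Parse the text into date-times. tz = "UTC" is an assumption for this sketch.
fmt <- "%m/%d/%Y %H:%M"
trips$started_at <- as.POSIXct(trips$started_at, format = fmt, tz = "UTC")
trips$ended_at   <- as.POSIXct(trips$ended_at,   format = fmt, tz = "UTC")

# difftime() subtracts one date-time vector from the other.
trips$ride_length <- difftime(trips$ended_at, trips$started_at, units = "mins")

# Hypothetical extra column: plain numeric minutes, handy for mean(), max(), etc.
trips$ride_length_mins <- as.numeric(trips$ride_length, units = "mins")

print(trips)
```

For these three rows, bad is all NA, while ride_length comes out as 12, 20 and 6 minutes. The original character data is not lost by the conversion as long as the format string matches; any row that fails to parse becomes NA, so it may be worth checking sum(is.na(trips$started_at)) after converting the full data set.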

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
