'left_join resulting in null values in R
I would like to combine data tables using a left_join by the 'Date' column. The resulting table merges the table, but the columns which joined to the data table are all NAs.
The statement used:
merge <- left_join(activity, sleep, by =c("Date" = "Date"))
Could this be an issue of Date formatting? The date format in 'activity' is %m/%d/%y and in 'sleep' is %m/%d/%y %I:%M:%S %p. The merged table's date format is %m/%d/%y.
Solution 1:[1]
Convert to "real"
Date-class andPOSIXt-class data first.This is the recommended approach unless immediately after this join you know that you will not be looking at timestamps ever again. That is, if you even need "time between activities" or "counts per day" or something similar, you must really go with this option.
First, convert all dates to
Dateand timestamps toPOSIXt. I'm going to guess at the fields here but include a placeholder for you to add more if you have them.library(dplyr) # replace `c(Date)` with more if you have more, e.g., `c(Date, Date2, Date3)` activity <- activity %>% mutate(across(c(Date), ~ as.Date(., format = "%m/%d/%y"))) # similarly, `c(Date)` sleep <- sleep %>% mutate( across(c(Date), ~ as.POSIXct(., format = "%m/%d/%y %I:%M:%S %p")), Date2 = as.Date(timestamp) ) merged <- left_join(activity, sleep, by = c("Date" = "Date2"))I created
Date2here because R does not compareDates equally withPOSIXtvariables. The rationale is unfortunately clear:Sys.Date() ; Sys.time() # [1] "2022-02-04" # [1] "2022-02-04 23:53:37 EST" as.numeric(Sys.Date()) ; as.numeric(Sys.time()) # [1] 19027 # [1] 1644036818By converting the
POSIXttoDatefixes that portion.Substrings.
Really, this is a hack, and only really useful if you intentionally will discard time/date from the rest of your analysis. (Even then, I feel hacky just recommending this ...)
sleep %>% mutate(Date2 = substring(Date, 1, 8)) %>% left_join(activity, ., by = c("Date" = "Date2"))
Just to be sure ... your data has 2-digit years, right? If not, then change "%y" to "%Y", and substring(Date, 1, 8) to substring(Date, 1, 10).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | r2evans |
