How to set levels for a vector where the first 8 characters vary?

I have time data that I am trying to convert into a factor with the levels "AM" and "PM".

The problem is that the first 8 or so characters vary before getting to the AM or PM. How do I set the levels when this is the case? Here is my data:

structure(c(7L, 1L, 9L, 11L, 13L, 15L, 17L, 19L, 21L, 23L, 3L, 
5L, 8L, 2L, 10L, 12L, 14L, 16L, 18L, 20L, 22L, 24L, 4L, 6L), .Label = c("4/12/2016 1:00:00 AM", 
"4/12/2016 1:00:00 PM", "4/12/2016 10:00:00 AM", "4/12/2016 10:00:00 PM", 
"4/12/2016 11:00:00 AM", "4/12/2016 11:00:00 PM", "4/12/2016 12:00:00 AM", 
"4/12/2016 12:00:00 PM", "4/12/2016 2:00:00 AM", "4/12/2016 2:00:00 PM", 
"4/12/2016 3:00:00 AM", "4/12/2016 3:00:00 PM", "4/12/2016 4:00:00 AM", 
"4/12/2016 4:00:00 PM", "4/12/2016 5:00:00 AM", "4/12/2016 5:00:00 PM", 
"4/12/2016 6:00:00 AM", "4/12/2016 6:00:00 PM", "4/12/2016 7:00:00 AM", 
"4/12/2016 7:00:00 PM", "4/12/2016 8:00:00 AM", "4/12/2016 8:00:00 PM", 
"4/12/2016 9:00:00 AM", "4/12/2016 9:00:00 PM"), class = "factor")

I am doing this so I can order the data on a graph.

[Figure: average_intensities]

As you can see, the data is currently ordered alphabetically, and I would like it to be ordered numerically instead. Any help would be appreciated! Thank you.



Solution 1:[1]

I would not recommend using factors or characters for dates: a date-time carries ordering and spacing information that a factor or a string does not, and ggplot2 handles date-times quite nicely.

It would probably be better to use lubridate to both parse the dates and extract meaningful information from them.

library(tidyverse)
dat <- tibble(
  # dmy_hms() reads "4/12/2016" as 4 December 2016;
  # use mdy_hms() instead if the data are month/day/year
  date = lubridate::dmy_hms(
    c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM", "4/12/2016 2:00:00 AM", 
      "4/12/2016 3:00:00 AM", "4/12/2016 4:00:00 AM", "4/12/2016 5:00:00 AM", 
      "4/12/2016 6:00:00 AM", "4/12/2016 7:00:00 AM", "4/12/2016 8:00:00 AM", 
      "4/12/2016 9:00:00 AM", "4/12/2016 10:00:00 AM", "4/12/2016 11:00:00 AM", 
      "4/12/2016 12:00:00 PM", "4/12/2016 1:00:00 PM", "4/12/2016 2:00:00 PM", 
      "4/12/2016 3:00:00 PM", "4/12/2016 4:00:00 PM", "4/12/2016 5:00:00 PM", 
      "4/12/2016 6:00:00 PM", "4/12/2016 7:00:00 PM", "4/12/2016 8:00:00 PM", 
      "4/12/2016 9:00:00 PM", "4/12/2016 10:00:00 PM", "4/12/2016 11:00:00 PM"
    )),
  # am() is TRUE for times before noon
  am_or_pm = ifelse(lubridate::am(date), "AM", "PM"),
  # random values standing in for the measurement to plot
  x = rnorm(24))

What is extra nice about lubridate is that it does a lot of the work for you, such as working out how years, months, days and times are specified in the character strings you give it. It also comes with a lot of useful functions, like am, which tells you whether a time is in the morning or the afternoon. That tibble could then be used in a plot:

dat %>% 
  ggplot(aes(date, x, color = am_or_pm)) + 
  geom_point() + 
  theme_bw()

which would look like this:

[Figure: Scatterplot with times]
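
If a factor is still needed, for example for a discrete axis, the original question can also be answered directly in base R. The following is only a sketch under a few assumptions: `times` is a hypothetical stand-in holding six of the asker's 24 levels, the strings are read as day/month/year to match `dmy_hms()` above, and `%p` is assumed to recognise "AM"/"PM" in the current locale (it does in English and C locales, but not in every locale).

```r
# Hypothetical stand-in for the asker's factor (six of the 24 levels)
times <- factor(c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM",
                  "4/12/2016 10:00:00 AM", "4/12/2016 12:00:00 PM",
                  "4/12/2016 1:00:00 PM", "4/12/2016 11:00:00 PM"))

# AM/PM is always the last two characters, however much the prefix varies
lab <- as.character(times)
am_pm <- factor(substr(lab, nchar(lab) - 1, nchar(lab)),
                levels = c("AM", "PM"))

# Parse the levels as date-times (assumption: day/month/year, 12-hour clock)
parsed <- as.POSIXct(levels(times),
                     format = "%d/%m/%Y %I:%M:%S %p", tz = "UTC")
stopifnot(!anyNA(parsed))  # fails if the locale does not understand %p

# Rebuild the factor with its levels in chronological order
times_ordered <- factor(times, levels = levels(times)[order(parsed)])
levels(times_ordered)
```

Because the levels are reordered by the parsed time rather than by the text, "12:00:00 AM" comes first and "10:00:00 AM" no longer sorts ahead of "2:00:00 AM", so a plot that uses `times_ordered` on a discrete axis follows the clock.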

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
