How can I plot "Arrival time" in my Shiny app? There are duplicates of each "Id" and "Arrival time".

I have chosen to use a long dataset for the plot. The variable "Arrival time" records when somebody arrives at work. The variable "Mode" is the preferred mode of transport. However, a participant can have more than one preferred mode. For example, some participants have chosen "Yes" for both "Car" and "Bicycle".

I need to plot the "Arrival time", which should be unique for every value of "Id".

I need to be able to filter the different modes of transport in my sidebarPanel.

I need to be able to choose all modes of transport at once.

library(shiny)
library(ggplot2)
library(dplyr)
library(tidyverse)
library(readxl)
library(dplyr)
library(do)
library(shinyWidgets)

#1. READ DATA FROM EXCEL
Survey <- read_excel("~/EIGSI/Project/Work files for R studio/06_Participant.xlsx")


#2.SUBSET DATA
Data <- subset(Survey, select = c(1,3,5:14))

colnames(Data) <- c(
  'Id',
  'Arrival time',
  'On foot',
  'Bicycle',
  'Bicycle (Yélo)',
  'Motorcycle/scooter',
  'Scooter (trotinette)',
  'Bus',
  'Train',
  'Car',
  'Carpool',
  'Car (Yélo)'
)

#3.GATHER DATA FOR LONG TABLE
Data <- Data %>% 
  gather(key = "Mode",
         value = "Answer",
         -Id, -`Arrival time`)



#4. OPTIONAL: ORDER THE X-AXIS OF THE PLOT
Data$`Arrival time` = ordered(Data$`Arrival time`, levels = c(
  "before 7h00",
  "7h00 - 7h30",
  "7h30 - 8h00",
  "8h00 - 8h30",
  "8h30 - 9h00",
  "9h00 - 9h30",
  "after 9h30"
  
))


#5. MAKE VECTOR OF CHOICES
a <- c(unique(Data$Mode))


#6. CONFIGURE THE SHINY APPLICATION
ui <- fluidPage(
  titlePanel("Arrival times"),  # Add a title panel
  sidebarLayout(  
    position = "right",
    
    sidebarPanel(h3("Inputs for histogram"),
                 pickerInput("Mode", "Select mode", choices = a, options = list(`actions-box` = TRUE),multiple = T),
                 br()
    ),
    
    
    # Inside the sidebarLayout, add a sidebarPanel
    mainPanel(
      plotOutput("myhist")
      
    )  
  )
)

# server.R ----
server <- function(input, output) {
  
  output$myhist <- renderPlot({
    
    observe({
      print(input$Mode) #ENABLES ME TO CHOOSE ALL VARIABLES
    })

    
  Data %>% 
      ggplot(aes(x=`Arrival time`))+
      geom_histogram(stat = "count", data = Data[Data$Mode==input$Mode ,])
    
  })
  
}

# Run the app ----
shinyApp(ui = ui, server = server)

This is an example of the dataset called Data, shown using the dput() function:

     structure(list(Id = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4), `Arrival time` = structure(c(3L, 
6L, 4L, 5L, 3L, 6L, 4L, 5L, 3L, 6L, 4L, 5L), .Label = c("before 7h00", 
"7h00 - 7h30", "7h30 - 8h00", "8h00 - 8h30", "8h30 - 9h00", "9h00 - 9h30", 
"after 9h30"), class = c("ordered", "factor")), Mode = c("On foot", 
"On foot", "On foot", "On foot", "Scooter (trotinette)", "Scooter (trotinette)", 
"Scooter (trotinette)", "Scooter (trotinette)", "Carpool", "Carpool", 
"Carpool", "Carpool"), Answer = c("Yes", "No", "No", "No", "No", 
"No", "No", "No", "No", "No", "No", "No")), row.names = c(1L, 
2L, 3L, 4L, 9157L, 9158L, 9159L, 9160L, 18313L, 18314L, 18315L, 
18316L), class = "data.frame")


Solution 1:[1]

So just to clarify: What you want to do is first select the rows where the mode is one of the chosen modes and the answer is yes, and then plot the arrival times of those rows.

First, you can subset the dataset so it's only the modes you want and the answers "Yes". Now you have only the rows you want, but like you pointed out, some rows are duplicated.

Next, drop the Mode and Answer columns and apply unique() to what remains. Dropping those two columns leaves only the Id and Arrival time columns, and unique() then removes the duplicate rows, leaving one row per Id (assuming each Id has only one arrival time).
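The steps above can be sketched with dplyr as follows. This is a minimal illustration, not the asker's real data: the small data frame is made up in the same long format as Data, and selected_modes is a hypothetical stand-in for input$Mode.

```r
library(dplyr)

# Made-up sample in the same long format as the question's Data
arrival_levels <- c("before 7h00", "7h00 - 7h30", "7h30 - 8h00", "8h00 - 8h30",
                    "8h30 - 9h00", "9h00 - 9h30", "after 9h30")
Data <- data.frame(
  Id = rep(1:4, times = 3),
  `Arrival time` = factor(
    rep(c("7h30 - 8h00", "9h00 - 9h30", "8h00 - 8h30", "8h30 - 9h00"), times = 3),
    levels = arrival_levels, ordered = TRUE
  ),
  Mode = rep(c("On foot", "Bicycle", "Car"), each = 4),
  Answer = c("Yes", "No",  "No",  "Yes",   # On foot
             "Yes", "Yes", "No",  "No",    # Bicycle
             "No",  "No",  "Yes", "No"),   # Car
  check.names = FALSE                      # keep the space in `Arrival time`
)

# Hypothetical stand-in for input$Mode in the Shiny app
selected_modes <- c("On foot", "Bicycle")

plot_data <- Data %>%
  filter(Mode %in% selected_modes, Answer == "Yes") %>%  # chosen modes, "Yes" only
  select(Id, `Arrival time`) %>%                         # drop Mode and Answer
  distinct()                                             # one row per Id

print(plot_data)
```

In this sample, Id 1 answered "Yes" to both "On foot" and "Bicycle", so it appears twice after the filter and once after distinct().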

One note: in your filter, you do this: data = Data[Data$Mode==input$Mode ,]

This is not what you want, for two reasons:

  1. It will take all rows that have the input Mode, even if the answer is No. You want to add an additional filter that the answer has to be Yes.
  2. It will not work with multiple selections. When several modes are selected, == compares element by element and recycles the shorter vector, so it matches the wrong rows. Use %in% instead of ==; the filter then works for any number of selections.
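Putting both fixes together, the server function could look roughly like the sketch below. It assumes Data is the long data frame built in the question's script; arrivals_for_modes is a hypothetical helper name introduced here, and the use of req(), geom_bar() and scale_x_discrete(drop = FALSE) are suggestions rather than part of the original answer.

```r
library(shiny)
library(ggplot2)
library(dplyr)

# Hypothetical helper: one row per Id among participants who answered
# "Yes" to at least one of the selected modes
arrivals_for_modes <- function(data, modes) {
  data %>%
    filter(Mode %in% modes, Answer == "Yes") %>%  # %in% handles several modes
    distinct(Id, `Arrival time`)                  # keeps only these two columns
}

server <- function(input, output) {
  output$myhist <- renderPlot({
    req(input$Mode)  # do nothing until at least one mode is selected

    # Data is assumed to be the long data frame from the question's script
    ggplot(arrivals_for_modes(Data, input$Mode), aes(x = `Arrival time`)) +
      geom_bar() +                      # counts rows per arrival-time slot
      scale_x_discrete(drop = FALSE)    # keep empty time slots on the axis
  })
}
```

Filtering inside renderPlot() on a local copy leaves the global Data untouched, so every change of selection starts again from the full dataset.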

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Aqeel Padaria