Linking a SelectId to a Random Forest Model in Shiny
I am relatively new to R, and even newer to Shiny, and I am having trouble linking a selectInput to a randomForest model within Shiny.
I have created a randomForest model that predicts the insurance cost to a customer, and it appears to perform well. I want to host this model in Shiny and let users change the risk information via sliders and drop-downs so that the cost value updates accordingly. This works perfectly with numerical fields, but once I add a selectInput list (a list of parking values, input = "garage"), I get the following error:
Warning: Error in predict.randomForest: New factor levels not present in the training data
Stack trace (innermost first):
85: predict.randomForest
84: predict
83: pred [#10]
82: renderText [#2]
81: func
80: origRenderFunc
79: output$guess
4: <Anonymous>
3: do.call
2: print.shiny.appobj
1: <Promise>
I assumed that a value in the drop down list was not in the model, so I went into the RandomForest object to select the actual values and place them in the code.
> rf$forest$xlevels$Parking
[1] "Driveway" "Locked garage" "On the road at home" "On the road away from home"
[5] "Other" "Residential car park" "Work car park"
Again, the same error came back. In the RF model, Parking has class factor. The Parking values are linked to input = "garage".
Please see a copy of my code below.
library(shiny)
library(randomForest)
library(datasets)
ui <- fluidPage( titlePanel("Van Market Premium - alpha"),
  checkboxInput(inputId = "comp", label = "Comprehensive"),
  sliderInput(inputId = "age", label = "Age of Driver", value = 25, min = 17, max = 100),
  sliderInput(inputId = "ncd", label = "No Claims Discount", value = 0, min = 0, max = 9),
  numericInput(inputId = "cc", label = "CC", value = 1600, min = 250, max = 5000),
  sliderInput(inputId = "value", label = "Current Van Value", value = 2000, min = 50, max = 20000, step = 250),
  sliderInput(inputId = "aov", label = "Age of Van [years]", value = 5, min = 0, max = 50),
  numericInput(inputId = "volxs", label = "Voluntary Excess", value = 0, min = 0, max = 1500),
  sliderInput(inputId = "mileage", label = "Annual Mileage", value = 5000, min = 1000, max = 50000, step = 1000),
  sliderInput(inputId = "length", label = "Ownership Length", value = 12, min = 0, max = 120, step = 6),
  checkboxInput(inputId = "fuel", label = "Petrol?"),
  checkboxInput(inputId = "auto", label = "Automatic?"),
  selectInput(input = "garage", label = "Overnight Location", choices = as.factor(c("On the road at home",
                                                                                     "Driveway",
                                                                                     "Locked garage",
                                                                                     "Other",
                                                                                     "Residential car park",
                                                                                     "Work car park",
                                                                                     "On the road away from home"))),
  textOutput("guess")
)
RF <- get(load("C:/Users//Documents/R/RF3.RData"))
pred <- function(co, ag, nc, cc, val, aov, vol, mil, len, fuel, auto, garage) {
  inputdata <- c(co, ag, nc, cc, val, aov, vol, mil, len, fuel, auto, garage)
  pred_data <- as.data.frame(t(inputdata))
  colnames(pred_data) <- c("Comp", "Age", "NCD", "CC", "Value", "AgeOfVehicle", "VoluntaryExcess",
                           "AnnualMileage", "LengthOwned", "petroldiesel", "auto", "Parking")
  prob_out <- predict(RF, pred_data)
  prob_out <- exp(prob_out)
  return(prob_out)
}
server <- function(input, output) {
  output$guess <- renderText({pred(input$comp, input$age, input$ncd, input$cc, input$value, input$aov,
                                   input$volxs, input$mileage, input$length, input$fuel, input$auto, input$garage
  )})
}
shinyApp(ui = ui, server = server)
The code works perfectly without the garage sections. I feel it's something to do with the data formats, but I am really struggling to sort this out.
Solution 1:[1]
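I can't say for sure since we don't have access to your model, but I'm somewhat confident that your character variables need to be converted to factors before they are passed to predict() for a randomForest model. Note that input$garage arrives in the server as a character string; wrapping the choices in as.factor() in the UI does not change that.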
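As a minimal sketch of that idea (not tested against your model, which we can't see): in pred(), c() coerces all twelve inputs to character as soon as the garage string is included, so every column of pred_data loses its numeric type. Building the data frame column by column avoids this, and factor(..., levels = ...) gives Parking the levels you printed from rf$forest$xlevels$Parking. The helper name make_pred_data and the parking_levels vector are illustrative, and I'm assuming the remaining columns were numeric or logical in the training data; adjust them if they were not.

```r
# Levels copied from the rf$forest$xlevels$Parking output in the question.
parking_levels <- c("Driveway", "Locked garage", "On the road at home",
                    "On the road away from home", "Other",
                    "Residential car park", "Work car park")

# What the original pred() runs into: one character value makes c()
# coerce every element to character.
mixed <- c(TRUE, 25, 0, 1600, "Driveway")
class(mixed)  # "character"

# Hypothetical helper: build the one-row data frame column by column so
# each column keeps its own type, and make Parking a factor that carries
# the training levels.
# ASSUMPTION: the other columns were numeric or logical in training.
make_pred_data <- function(co, ag, nc, cc, val, aov, vol, mil, len,
                           fuel, auto, garage) {
  data.frame(
    Comp            = co,
    Age             = ag,
    NCD             = nc,
    CC              = cc,
    Value           = val,
    AgeOfVehicle    = aov,
    VoluntaryExcess = vol,
    AnnualMileage   = mil,
    LengthOwned     = len,
    petroldiesel    = fuel,
    auto            = auto,
    Parking         = factor(garage, levels = parking_levels)
  )
}

# Example call with the default values from the UI.
pred_data <- make_pred_data(FALSE, 25, 0, 1600, 2000, 5, 0, 5000, 12,
                            FALSE, FALSE, "Driveway")
str(pred_data)
```

Inside pred(), the result could then be passed to predict(RF, pred_data) in place of the data frame built with c() and t().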
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | TylerH |
