Prediction Dataframe with Multiple Factors

I am trying to create a prediction data frame for a model I fitted that has two factors and a continuous variable. The data frame I want to create, to plot model predictions for the first factor, is given below:

Preds.Month = data.frame(Month = factor(1:12), 
                         VegeType = factor(1:12), 
                         DistAgriLand = median(a$DistAgriLand, na.rm = TRUE))

But I get this message: Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor VegeType has new levels 6

However, if I remove the factor VegeType from the model, re-fit the model, and create the prediction data frame again, it works fine. I am not sure what the error means or how to resolve it; any help would be greatly appreciated. In case it plays a part in the error: although VegeType has 12 levels, only 5 of them have data.

Here is some sample data:

a = structure(list(Month = structure(c(9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 
12L, 12L, 12L, 12L, 3L, 4L, 6L, 6L, 8L, 8L, 10L, 10L, 12L, 12L, 
3L, 3L, 3L, 6L, 6L, 10L, 10L, 3L, 3L, 3L, 6L, 6L, 10L, 10L, 3L, 
6L, 6L, 10L, 10L, 3L, 6L, 6L, 10L, 10L, 3L, 4L, 6L, 6L, 8L, 8L, 
10L, 10L, 12L, 12L, 3L, 4L, 6L, 6L, 8L, 8L), .Label = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"), class = "factor"), 
    VegeType = structure(c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
    6L), .Label = c("1", "2", "3", "4", "5", "7", "8", "9", "10", 
    "11", "12"), class = "factor"), DistAgriLand = c(580.5, 580.5, 
    580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 
    580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 
    580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 
    580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 
    580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 580.5, 
    580.5, 580.5, 580.5, 594.37, 594.37, 594.37, 594.37, 594.37, 
    594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 
    594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 
    594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 
    594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 
    594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 594.37, 
    594.37, 594.37, 594.37, 594.37, 594.37)), row.names = c(NA, 
100L), class = "data.frame")


Solution 1:[1]

Try adding the missing level to VegeType:

Preds.Month = data.frame(Month = factor(1:12), 
                         VegeType = c(levels(a$VegeType),6), 
                         DistAgriLand = median(a$DistAgriLand, na.rm = TRUE))

Output:

   Month VegeType DistAgriLand
1      1        1      587.435
2      2        2      587.435
3      3        3      587.435
4      4        4      587.435
5      5        5      587.435
6      6        7      587.435
7      7        8      587.435
8      8        9      587.435
9      9       10      587.435
10    10       11      587.435
11    11       12      587.435
12    12        6      587.435

However, the better approach is to make sure that level 6 is present when you create VegeType in the first place, i.e. levels = 1:12. In the structure you provided, every value has the integer code 6, but the labels are .Label = c("1", "2", "3", "4", "5", "7", "8", "9", "10", "11", "12"), so there is no level "6" and code 6 maps to the label "7". Did you intend to label VegeType = 6 with "7"?
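To see how a factor's integer codes and its labels can drift apart like this, here is a minimal sketch. The vectors labs, v and v_fixed are made up for illustration and are not the asker's data:

```r
# Label set that skips "6", as in the question's dput() output (illustrative).
labs <- c("1", "2", "3", "4", "5", "7", "8", "9", "10", "11", "12")

# A factor whose values all carry the label "7", the 6th entry of labs.
v <- factor(rep("7", 3), levels = labs)

codes   <- as.integer(v)        # underlying integer codes (position in labs)
shown   <- as.character(v)      # labels that are printed and used by predict()
has_six <- "6" %in% levels(v)   # is "6" a known level?

# Rebuilding from the intended values with an explicit, complete level set.
v_fixed <- factor(rep(6, 3), levels = 1:12)

print(codes)                    # 6 6 6
print(shown)                    # "7" "7" "7"
print(has_six)                  # FALSE
print(as.character(v_fixed))    # "6" "6" "6"
```

The point is that the model only knows the labels, so a prediction row with VegeType "6" counts as a new level even though the stored code is 6.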

Finally, if you want to predict for all levels of Month and VegeType, you can do this:

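# expand.grid() gives every Month x VegeType combination (144 rows).
# Note: the two grid columns are integers here; convert them with factor()
# if the model was fitted on factors.
Preds.Month = setNames(
  cbind(
    expand.grid(1:12, 1:12),
    median(a$DistAgriLand, na.rm = TRUE)
  ), c("Month", "VegeType", "DistAgriLand")
)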
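
Keep in mind that predict() will still reject any factor level the model was not fitted with. As a hedged sketch, one way around that is to build the grid from the levels stored in the fitted model itself. The data frame d, the columns Dist and y, and the lm() fit below are assumptions for illustration, since the question does not show the actual model:

```r
# Hypothetical toy data: both factors declare 12 levels but use only a few.
set.seed(1)
d <- data.frame(
  Month    = factor(rep(c(3, 6, 10), times = 20), levels = 1:12),
  VegeType = factor(rep(c(2, 7), times = 30), levels = 1:12),
  Dist     = runif(60, 500, 600)
)
d$y <- rnorm(60)

# lm() drops unused factor levels when fitting; the levels it kept
# are stored in fit$xlevels.
fit <- lm(y ~ Month + VegeType + Dist, data = d)

# Build the grid only from levels the model has seen, holding the
# continuous covariate at its median.
newd <- expand.grid(
  Month    = fit$xlevels$Month,
  VegeType = fit$xlevels$VegeType,
  Dist     = median(d$Dist, na.rm = TRUE),
  stringsAsFactors = TRUE
)
newd$pred <- predict(fit, newdata = newd)

print(fit$xlevels)
print(newd)
```

Levels with no data cannot be predicted by this kind of model, so restricting the grid to fit$xlevels avoids the "has new levels" error rather than working around it.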

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1