I am having trouble plotting this logistic regression model

Please help me plot this model. I tried just using the plot function, but I'm not sure how to incorporate the testing dataset. Thank you.

TravelInsurance <- read.csv(file="TravelInsurancePrediction.csv",header=TRUE)
set.seed(2022)
Training <- sample(c(1:1987),1500,replace=FALSE)
Test <- c(1:1987)[-Training]
TrainData <- TravelInsurance[Training,]
TestData <- TravelInsurance[Test,]

TravIns=as.factor(TravelInsurance$TravelInsurance)
years= TravelInsurance$Age
EMPTY=as.factor(TravelInsurance$Employment.Type)
Grad=as.factor(TravelInsurance$GraduateOrNot)
Income=TravelInsurance$AnnualIncome
Fam=TravelInsurance$FamilyMembers
CD=as.factor(TravelInsurance$ChronicDiseases) 
FF=as.factor(TravelInsurance$FrequentFlyer)


logreg = glm(TravIns~ EMPTY+years+Grad+Income+Fam+CD+FF,family = binomial)


Solution 1:[1]

Too long for a comment.

Couple of things here:

  1. You split your dataset into training and test sets, but then build the model using the full dataset.
  2. Passing vectors is not a good way to use glm(...), or any of the R modeling functions. Better to pass the data frame and reference the columns in the formula.
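To illustrate the second point, here is a minimal sketch on simulated data. The data frame df, the vectors xv and yv, and the model names are hypothetical and not taken from the question. When the formula refers to free-standing vectors, glm() finds them in the calling environment and the data= argument has no effect, so the model is still fit on every row:

```r
set.seed(1)

# Simulated data, made up for illustration
df <- data.frame(x = rnorm(200))
df$y <- rbinom(200, 1, plogis(0.5 * df$x))
train <- df[1:150, ]

# Free-standing vectors built from the full data, as in the question
xv <- df$x
yv <- df$y

# The formula names vectors that are not columns of `train`,
# so data = train is silently ignored
fit_vectors <- glm(yv ~ xv, family = binomial, data = train)

# The formula names columns of `train`, so only those rows are used
fit_columns <- glm(y ~ x, family = binomial, data = train)

nobs(fit_vectors)  # 200: all rows were used
nobs(fit_columns)  # 150: only the training rows were used
```

The same lookup rule applies to predict(): newdata only takes effect when the formula's variable names are columns of the new data frame.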

So, with your dataset,

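# Reference the data frame's own columns so that data= and newdata= take effect
logreg <- glm(factor(TravelInsurance) ~ Employment.Type + Age + GraduateOrNot + AnnualIncome +
                FamilyMembers + factor(ChronicDiseases) + FrequentFlyer, family = binomial, data = TrainData)
pred   <- predict(logreg, newdata = TestData, type = 'response')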

As this is a logistic regression, the responses are probabilities (that someone buys travel insurance?). There are several ways to assess goodness-of-fit. One visualization uses receiver operating characteristic (ROC) curves.

library(pROC)
roc(TestData$TravelInsurance, pred, plot=TRUE)

The area under the ROC curve (the "AUC") is a measure of goodness of fit; 1.0 is perfect, 0.5 is no better than random chance. See the docs: ?roc and ?auc
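For intuition about what that number means: the AUC equals the probability that a randomly chosen positive case gets a higher predicted probability than a randomly chosen negative case. Below is a minimal base-R sketch of that rank-based calculation. auc_by_rank is a hypothetical helper written for illustration, not part of pROC, and the toy labels and scores are made up:

```r
# Hypothetical helper: AUC from the rank-sum (Mann-Whitney) statistic.
# labels: 0/1 vector of true classes; scores: predicted probabilities.
auc_by_rank <- function(labels, scores) {
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  r <- rank(scores)  # tied scores receive their average rank
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data, made up for illustration
labels <- c(0, 0, 1, 1)
scores <- c(0.1, 0.4, 0.35, 0.8)
auc_by_rank(labels, scores)  # 0.75
```

Three of the four positive/negative pairs are ordered correctly here, hence 0.75. Assuming the outcome column is coded 0/1, the same helper could be applied to the test-set outcome and pred as a rough cross-check of what roc() reports.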

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 jlhoward