I am having trouble plotting this logistic regression model
Please help me plot this model. I tried just using the plot function, but I'm not sure how to incorporate the testing dataset. Thank you.
TravelInsurance <- read.csv(file="TravelInsurancePrediction.csv",header=TRUE)
set.seed(2022)
Training <- sample(c(1:1987),1500,replace=FALSE)
Test <- c(1:1987)[-Training]
TrainData <- TravelInsurance[Training,]
TestData <- TravelInsurance[Test,]
TravIns=as.factor(TravelInsurance$TravelInsurance)
years= TravelInsurance$Age
EMPTY=as.factor(TravelInsurance$Employment.Type)
Grad=as.factor(TravelInsurance$GraduateOrNot)
Income=TravelInsurance$AnnualIncome
Fam=TravelInsurance$FamilyMembers
CD=as.factor(TravelInsurance$ChronicDiseases)
FF=as.factor(TravelInsurance$FrequentFlyer)
logreg = glm(TravIns~ EMPTY+years+Grad+Income+Fam+CD+FF,family = binomial)
Solution 1:[1]
Too long for a comment.
Couple of things here:
- You divide your dataset into train and test sets, but then build the model using the full dataset.
- Passing vectors is not a good way to use glm(...), or any of the R modeling functions. Better to pass the data frame and reference its columns in the formula.
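To illustrate the second point, here is a minimal sketch on made-up data (the data frame df and the vectors xv and yv are invented for this example, not taken from the travel insurance file). When the formula names free-standing vectors, R finds them in the workspace rather than in data, so the data argument has no effect on which rows are fitted:

```r
set.seed(1)

# Made-up data frame (hypothetical columns, not the travel insurance file)
df <- data.frame(x = rnorm(200))
df$y <- rbinom(200, 1, plogis(0.5 * df$x))
train <- df[1:150, ]

# Free-standing vectors copied from the FULL data frame, as in the question
xv <- df$x
yv <- df$y

# xv and yv are not columns of train, so they are looked up in the workspace
fit_vectors <- glm(yv ~ xv, family = binomial, data = train)

# x and y are columns of train, so only the 150 training rows are used
fit_columns <- glm(y ~ x, family = binomial, data = train)

nobs(fit_vectors)  # 200: the full dataset was fitted
nobs(fit_columns)  # 150: only the training rows were fitted
```

Checking nobs() on a fitted model is a quick way to confirm that it was trained on the rows you intended.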
So, with your dataset,
# Use the data frame's own column names so that only the rows of TrainData are fitted
logreg <- glm(factor(TravelInsurance) ~ Employment.Type + Age + GraduateOrNot + AnnualIncome + FamilyMembers + factor(ChronicDiseases) + FrequentFlyer,
              family = binomial, data = TrainData)
pred <- predict(logreg, newdata = TestData, type = 'response')
As this is a logistic regression, the responses are probabilities (that someone buys travel insurance?). There are several ways to assess goodness-of-fit. One visualization uses receiver operating characteristic (ROC) curves.
library(pROC)
roc(TestData$TravelInsurance, pred, plot = TRUE)
The area under the ROC curve (the "AUC") is a measure of goodness of fit; 1.0 is perfect, 0.5 is no better than random chance. See the docs: ?roc and ?auc
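For intuition about what this number measures: it equals the probability that a randomly chosen positive case gets a higher predicted probability than a randomly chosen negative case. A rough sketch in base R on simulated data (the data frame dat, its columns, and the 0.5 cutoff are assumptions for illustration, not part of the question's dataset):

```r
set.seed(42)

# Simulated stand-in for a dataset with a binary outcome (hypothetical columns)
n <- 400
dat <- data.frame(x = rnorm(n))
dat$actual <- rbinom(n, 1, plogis(1.5 * dat$x))
train <- dat[1:300, ]
test <- dat[301:400, ]

# Fit on the training rows, predict probabilities for the held-out rows
fit <- glm(actual ~ x, family = binomial, data = train)
prob <- predict(fit, newdata = test, type = "response")

# Area under the ROC curve via the rank (Mann-Whitney) formula;
# rank() averages ties, so tied predictions count as half
is_pos <- test$actual == 1
n_pos <- sum(is_pos)
n_neg <- sum(!is_pos)
auc_manual <- (sum(rank(prob)[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Confusion table and accuracy at an assumed cutoff of 0.5
predicted <- as.integer(prob > 0.5)
conf <- table(actual = test$actual, predicted = predicted)
accuracy <- mean(predicted == test$actual)

auc_manual
conf
accuracy
```

With real data, test$actual and prob would be replaced by the test labels and the predictions from the fitted model. The rank-based value should match what pROC reports when higher predictions go with the positive class.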
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jlhoward |
