How to use all variables for logistic regression in Python with statsmodels (equivalent to R glm)

I would like to conduct Logistic Regression in Python.

My reference in R is

model_1 <- glm(status_1 ~., data = X_train, family=binomial)
summary(model_1)

I'm trying to convert this to Python, but I'm not sure how to include all variables.

import statsmodels.api as sm
model = sm.formula.glm("status_1 ~ ", family=sm.families.Binomial(), data=train).fit()
print(model.summary())

How can I use all variables? In other words, what do I need to put after "status_1 ~" in the formula?



Solution 1:[1]

statsmodels makes logistic regression fairly straightforward:

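import statsmodels.api as sm

# Predictors; add_constant adds the intercept column that R's glm fits by default
Xtrain = sm.add_constant(df[['gmat', 'gpa', 'work_experience']])
# Binary (0/1) response
ytrain = df[['admitted']]

log_reg = sm.Logit(ytrain, Xtrain).fit()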

Here gmat, gpa and work_experience are the independent variables and admitted is the binary dependent variable.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1