How to use cross validation with sm.GLM (sm = statsmodels.api)?


I am trying to fill in the "model" parameter of cross_val_score. However, since I need to use sm.GLM, I don't know what to pass for it.

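import numpy as np
import pandas as pd
import statsmodels.api as sm
from math import sqrt
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split, cross_val_score

# X and y are pandas Series (age and wage, judging by the keys used below)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1)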

n = 5

df_cut, bins = pd.cut(X_train, n, retbins=True, right=True)
df_steps = pd.concat([X_train, df_cut, y_train], keys=['age','age_cuts','wage'], axis=1)

# Create dummy variables for the age groups
df_steps_dummies = pd.get_dummies(df_cut) 

# Fitting Generalised linear models
fit3 = sm.GLM(df_steps.wage, df_steps_dummies).fit() 

# Binning validation set into same 5 bins
bin_mapping = np.digitize(X_test, bins, right=True) 
X_valid = pd.get_dummies(bin_mapping)

# Removing any outliers
# X_valid = pd.get_dummies(bin_mapping).drop([6], axis=1)

# Prediction
pred2 = fit3.predict(X_valid)

# Calculating RMSE
rms = sqrt(mean_squared_error(y_test, pred2)) 
print(rms) 

scores = cross_val_score(model, X, y, scoring='neg_mean_squared_error',
                         cv=cv, n_jobs=-1)

So, instead of computing the RMSE by hand as above, I would like to use cross_val_score. For example, if I wanted to use lasso, I would set model = Lasso(). Here, however, I cannot set model = sm.GLM(), since sm.GLM takes the data (endog and exog) in its constructor rather than through a scikit-learn style fit(X, y).

I hope it is clear...

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
