Fitting a linear regression model with and without an intercept in Python
I need to fit two linear regression models, Model 1: y = β1x1 + ε and Model 2: y = β0 + β1x1 + ε, to the data x1 = [0,1,2,3,4] and y = [1,2,3,2,1]. My objective is to find the coefficients, the squared error loss, the absolute error loss, and the L1.5 loss for both models.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
import statsmodels.formula.api as smf
x1 = ([0,1,2,3,4])
y = ([1,2,3,2,1])
Would you please show me a way to get these?
Solution 1:[1]
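This first method doesn't use the formula API. Note that sm.OLS does not add an intercept on its own, so as written this fits Model 1 (no intercept).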
import statsmodels.api as sm
import numpy as np
x1 = np.array([0,1,2,3,4])
y = np.array([1,2,3,2,1])
x1 = x1[:, None]  # Reshape into a (5, 1) array
res = sm.OLS(y,x1).fit()
print(res.summary())
If you want to use the formula interface, you need to build a DataFrame, and then the regression is "y ~ x1". The formula interface adds an intercept by default, so this fits Model 2; to drop the intercept, append - 1 to the right-hand side of the formula ("y ~ x1 - 1").
import statsmodels.formula.api as smf
import pandas as pd
x1 = [0,1,2,3,4]
y = [1,2,3,2,1]
data = pd.DataFrame({"y":y,"x1":x1})
res = smf.ols("y ~ x1", data).fit()
print(res.summary())
The formula version, which includes the intercept, produces:
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                 -0.333
Method:                 Least Squares   F-statistic:                 4.758e-16
Date:                Wed, 17 Mar 2021   Prob (F-statistic):               1.00
Time:                        22:11:40   Log-Likelihood:                -5.6451
No. Observations:                   5   AIC:                             15.29
Df Residuals:                       3   BIC:                             14.51
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      1.8000      0.748      2.405      0.095      -0.582       4.182
x1                  0      0.306          0      1.000      -0.972       0.972
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   1.429
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.375
Skew:                           0.344   Prob(JB):                        0.829
Kurtosis:                       1.847   Cond. No.                         4.74
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
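For the model without an intercept, the formula interface can be told to drop the constant. The following is a minimal sketch, assuming the same data as in the question; the names res_with and res_without are only illustrative.

```python
import pandas as pd
import statsmodels.formula.api as smf

data = pd.DataFrame({"x1": [0, 1, 2, 3, 4], "y": [1, 2, 3, 2, 1]})

# Model 2: the formula interface adds an intercept automatically.
res_with = smf.ols("y ~ x1", data).fit()

# Model 1: "- 1" removes the intercept, so the fit passes through the origin.
res_without = smf.ols("y ~ x1 - 1", data).fit()

print(res_with.params)     # entries named Intercept and x1
print(res_without.params)  # a single entry named x1
```

Writing "y ~ x1 + 0" should be an equivalent way to drop the intercept.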
To include an intercept in the non-formula API, you can simply use
res_constant = sm.OLS(y, sm.add_constant(x1)).fit()
Solution 2:[2]
You can use sklearn's LinearRegression.
For the model without an intercept (forcing the fit through the origin), simply set the parameter fit_intercept=False.
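A minimal sketch of both fits with sklearn, assuming the data from the question; the variable names model_origin and model_intercept are only illustrative. Note that sklearn expects the features as a 2-D array, hence the reshape.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([0, 1, 2, 3, 4]).reshape(-1, 1)  # shape (5, 1)
y = np.array([1, 2, 3, 2, 1])

# Model 1: no intercept, the line is forced through the origin.
model_origin = LinearRegression(fit_intercept=False).fit(X, y)

# Model 2: intercept included (fit_intercept=True is the default).
model_intercept = LinearRegression().fit(X, y)

print("Model 1 slope:", model_origin.coef_, "intercept:", model_origin.intercept_)
print("Model 2 slope:", model_intercept.coef_, "intercept:", model_intercept.intercept_)
```

Residuals for either model can then be obtained as y - model.predict(X).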
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Kevin S |
| Solution 2 | CelineDion |
