How to fit a linear regression model with and without an intercept in Python

I need to fit two linear regression models, Model 1: y = β1x1 + ε and Model 2: y = β0 + β1x1 + ε, to the data x1 = [0,1,2,3,4], y = [1,2,3,2,1]. My objective is to find the coefficients, the squared error loss, the absolute error loss, and the L1.5 loss for both models.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
import statsmodels.formula.api as smf

x1 = [0,1,2,3,4]
y = [1,2,3,2,1]

Would you please show me a way to compute these?



Solution 1:[1]

This first method doesn't use the formula API. Because sm.OLS does not add a constant on its own, it fits Model 1 (no intercept).

import statsmodels.api as sm
import numpy as np

x1 = np.array([0,1,2,3,4])
y = np.array([1,2,3,2,1])
x1 = x1[:, None] # Reshape into a (5, 1) array

res = sm.OLS(y,x1).fit()

print(res.summary())

If you want to use the formula interface, you need to build a DataFrame, and then the regression is "y ~ x1". The formula interface adds an intercept by default, so this fits Model 2; to drop the constant, put - 1 (or + 0) on the right-hand side of the formula, as in "y ~ x1 - 1".

import statsmodels.formula.api as smf
import pandas as pd

x1 = [0,1,2,3,4]
y = [1,2,3,2,1]
data = pd.DataFrame({"y":y,"x1":x1})
res = smf.ols("y ~ x1", data).fit()
print(res.summary())

The formula version, which includes an intercept, produces

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                 -0.333
Method:                 Least Squares   F-statistic:                 4.758e-16
Date:                Wed, 17 Mar 2021   Prob (F-statistic):               1.00
Time:                        22:11:40   Log-Likelihood:                -5.6451
No. Observations:                   5   AIC:                             15.29
Df Residuals:                       3   BIC:                             14.51
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      1.8000      0.748      2.405      0.095      -0.582       4.182
x1                  0      0.306          0      1.000      -0.972       0.972
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   1.429
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.375
Skew:                           0.344   Prob(JB):                        0.829
Kurtosis:                       1.847   Cond. No.                         4.74
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
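
For Model 1 through the formula interface, a minimal sketch (assuming the same data as above and that pandas and statsmodels are installed) drops the intercept with - 1. The name res_no_int is only illustrative:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Same data as in the question
data = pd.DataFrame({"y": [1, 2, 3, 2, 1], "x1": [0, 1, 2, 3, 4]})

# "- 1" removes the intercept that the formula interface adds by default
res_no_int = smf.ols("y ~ x1 - 1", data).fit()

# Only the x1 coefficient is estimated; there is no Intercept entry
print(res_no_int.params)
```

Calling res_no_int.summary() should then show a single coefficient row for x1.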

To include an intercept in the non-formula API, you can simply use

res_constant = sm.OLS(y, sm.add_constant(x1)).fit()
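
The summary does not report the squared, absolute, or L1.5 losses the question asks for. Here is a rough sketch of one way to get them, assuming each loss means the sum of |residual|^p (p = 2, 1, 1.5) evaluated at the least-squares fit, not coefficients that minimise each loss separately. It uses np.linalg.lstsq instead of statsmodels to stay self-contained, and fit_ols and losses are hypothetical helper names:

```python
import numpy as np

x1 = np.array([0, 1, 2, 3, 4], dtype=float)
y = np.array([1, 2, 3, 2, 1], dtype=float)

def fit_ols(X, y):
    """Least-squares coefficients for the columns already present in X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def losses(y, y_hat):
    """Assumed definition: sum of |residual|**p for p = 2, 1 and 1.5."""
    r = np.abs(y - y_hat)
    return {"squared": np.sum(r**2), "absolute": np.sum(r), "L1.5": np.sum(r**1.5)}

# Model 1: y = b1*x1 (no column of ones, so no intercept)
X_no_int = x1[:, None]
# Model 2: y = b0 + b1*x1 (column of ones carries the intercept)
X_int = np.column_stack([np.ones_like(x1), x1])

coef1 = fit_ols(X_no_int, y)
coef2 = fit_ols(X_int, y)
loss1 = losses(y, X_no_int @ coef1)
loss2 = losses(y, X_int @ coef2)

print("Model 1 coefficients:", coef1, "losses:", loss1)
print("Model 2 coefficients:", coef2, "losses:", loss2)
```

If the fits come from statsmodels instead, the same losses function can be applied to y and res.fittedvalues. If the losses are meant as means, not sums, divide each by len(y).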

Solution 2:[2]

You can use sklearn's LinearRegression.

For the model without an intercept (a fit forced through the origin), simply set the parameter fit_intercept=False.
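
A minimal sketch of both fits, assuming scikit-learn is installed and using the data from the question (model1 and model2 are just illustrative names):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# scikit-learn expects a 2-D feature array, hence the reshape to (5, 1)
x1 = np.array([0, 1, 2, 3, 4]).reshape(-1, 1)
y = np.array([1, 2, 3, 2, 1])

# Model 1: no intercept, the line is forced through the origin
model1 = LinearRegression(fit_intercept=False).fit(x1, y)
# Model 2: intercept estimated along with the slope (the default)
model2 = LinearRegression(fit_intercept=True).fit(x1, y)

print("Model 1 slope:", model1.coef_, "intercept:", model1.intercept_)
print("Model 2 slope:", model2.coef_, "intercept:", model2.intercept_)
```

Predictions from model1.predict(x1) and model2.predict(x1) can then be compared with y to compute whichever loss is needed.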

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Kevin S
Solution 2 CelineDion