Loop through Regressions

I need to run a series of cross-sectional regressions for several years. As such, I'm looking to automate it all with a loop. The loop needs to go through each column (year) in my dataset and run a separate regression. Let me elaborate with more details:

Consider the following y variable - rows are data points while columns represent each year:

y variable

In addition, consider the following X variable - rows are data points while columns represent each year:

X variable

For each year I need to run a regression using the data of all 18 rows and a single column. More specifically, if I had to do it for one year, it would look something like this:

regression = sm.OLS(y.iloc[:, 0], X.iloc[:, 0])
results = regression.fit()
results.params

I could basically just save the regression output, then move on to the next year:

regression = sm.OLS(y.iloc[:, 1], X.iloc[:, 1])
results = regression.fit()
results.params

And the next year:

regression = sm.OLS(y.iloc[:, 2], X.iloc[:, 2])
results = regression.fit()
results.params

So basically, I need something that can loop through each column (year), conduct a regression, save the output, perhaps in a new dataframe if possible. What I'm looking for is the coefficients from the regression. I also need to add a constant.

Please let me know if more details are needed!



Solution 1:[1]

Create a list of all the results:

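import pandas as pd
import statsmodels.api as sm

all_params = []
for col in range(X.shape[1]):
    # Use column `col` of both y and X, not column 0 on every pass
    regression = sm.OLS(y.iloc[:, col], X.iloc[:, col])
    results = regression.fit()
    all_params.append(results.params)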

If you want a DataFrame with all the results:

df = pd.DataFrame(all_params)

If by adding a constant you mean adding a fixed value to every coefficient in the results (const here is a number you define yourself):

df = df + const

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 SiP