Rolling Regression Residuals Python

I hope you can help me with my problem. I want to do a rolling regression on a dataframe in Python and calculate the standard deviation on only a part of the residuals.

For example: in the table below I want to estimate parameters based on a moving window (e.g. Y = [5,7,9,10] on X_1 = [1,2,4,5] and X_2 = [2,3,4,4]), which results in intercept = 2.4, B_1 = 0.7 and B_2 = 1. These estimates lead to residuals = [4.8,0.5,-0.2,-0.2], and the standard deviation is computed from only the last 3 residuals [0.5,-0.2,-0.2]. The result should be passed to the column ["Standard deviation"].

Index  Y   X_1  X_2  Standard deviation
0      5   1    2    0.404145188
1      7   2    3    2.081665999
2      9   4    4    2.511132239
3      10  5    4    0.864408264
4      11  6    2    nan
5      14  5    5    nan
6      17  7    6    nan

My original dataset is huge, so I tried to avoid a for loop. My approach so far is to run a regression in each row using the following function (which does not give the desired result):

import statsmodels.api as sm

df["Standard deviation"] = df.rolling(window=4).apply(lambda x: (df["Y"] - sm.OLS(df["Y"], df[["X_1", "X_2"]]).fit().predict()).std())

However, the function only works on the entire column - so it is not a rolling regression and I could not find a way to only calculate the standard deviation based on the last 3 residuals.
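One possible way to get a per-window fit without an explicit Python loop is sketched below. It is only an illustration under some assumptions: the helper name `rolling_resid_std` and its parameters are hypothetical, the fit uses NumPy's pseudo-inverse instead of statsmodels, an intercept column is added, and the window is taken to start at each row and look forward (so the result for rows 0 to 3 lands in row 0, as in the table). Because the residuals come from an actual least-squares fit, the numbers need not match the illustrative values in the table.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_resid_std(df, y_col, x_cols, window=4, last=3):
    """Hypothetical helper: forward-looking rolling OLS with intercept.

    For each window, returns the sample standard deviation (ddof=1)
    of the last `last` residuals; rows without a full window get NaN.
    """
    y = df[y_col].to_numpy(dtype=float)
    # Design matrix with a leading column of ones for the intercept.
    X = np.column_stack([np.ones(len(df)), df[x_cols].to_numpy(dtype=float)])

    # yw: (m, window), Xw: (m, window, k), with m = n - window + 1.
    yw = sliding_window_view(y, window)
    Xw = sliding_window_view(X, window, axis=0).transpose(0, 2, 1)

    # Least-squares coefficients for all windows at once: (m, k, 1).
    beta = np.linalg.pinv(Xw) @ yw[..., None]
    resid = yw - (Xw @ beta)[..., 0]

    out = np.full(len(df), np.nan)
    out[: len(yw)] = resid[:, -last:].std(axis=1, ddof=1)
    return pd.Series(out, index=df.index)


# Example data copied from the table in the question.
df = pd.DataFrame({
    "Y": [5, 7, 9, 10, 11, 14, 17],
    "X_1": [1, 2, 4, 5, 6, 5, 7],
    "X_2": [2, 3, 4, 4, 2, 5, 6],
})
df["Standard deviation"] = rolling_resid_std(df, "Y", ["X_1", "X_2"])
```

The stacked windows are materialised during the pseudo-inverse step, so memory grows with the number of rows times the window size. For a very large frame it may be worth calling the helper on chunks that overlap by `window - 1` rows.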



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
