Execute Fama French 3 Factor Model in R

I am attempting to run an OLS regression with the Fama French 3 Factor model, but I am unsure what my data frame should look like for the regression.

My data frame is currently a time series so it looks something like this (all tables are very simplified on both axes):

Date             Mkt-RF         HML            SMB            Fund A            Fund B
2021-01-31      -0.0125       0.0166         -0.0029          0.0181           -0.0199
2021-02-28       0.0260       0.0087          0.0573         -0.0167            0.0255
2021-03-31       0.0318      -0.0197          0.0347          0.0531            0.0557
2021-04-30       0.0529       0.0214         -0.0319          0.0018            0.0174

However, with the way a time series is set up, I have only managed to build two regression models:

One where I regress the return of the average fund portfolio on the factors, where the data frame looks like this:

Date             Mkt-RF         HML            SMB         Average Fund
2021-01-31      -0.0125       0.0166         -0.0029          0.0009
2021-02-28       0.0260       0.0087          0.0573          0.0088
2021-03-31       0.0318      -0.0197          0.0347          0.0544
2021-04-30       0.0529       0.0214         -0.0319          0.0096

And one where I used a regression loop, as explained here, where the data frame looks like this:

Date             Mkt-RF         HML            SMB            Return
2021-01-31      -0.0125       0.0166         -0.0029          0.0181
2021-01-31      -0.0125       0.0166         -0.0029         -0.0199
2021-02-28       0.0260       0.0087          0.0573         -0.0167
2021-02-28       0.0260       0.0087          0.0573          0.0255
2021-03-31       0.0318      -0.0197          0.0347          0.0531
2021-03-31       0.0318      -0.0197          0.0347          0.0557
2021-04-30       0.0529       0.0214         -0.0319          0.0018
2021-04-30       0.0529       0.0214         -0.0319          0.0174
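
To make the setup concrete, here is a minimal sketch of how I get from the wide table to the long one and run one regression per fund. The data are simulated rather than my real returns, and the names `wide`, `long`, `fund_cols`, the syntactic column names such as `Mkt.RF`, and the `Fund` identifier column (not shown in the simplified table above) are placeholders of my own.

```r
# Simulated toy data, NOT real returns: 24 months, 3 hypothetical funds.
set.seed(1)
n_months  <- 24
fund_cols <- c("Fund.A", "Fund.B", "Fund.C")

wide <- data.frame(
  Date   = seq(as.Date("2021-01-01"), by = "month", length.out = n_months),
  Mkt.RF = rnorm(n_months, 0.01, 0.04),
  HML    = rnorm(n_months, 0.00, 0.02),
  SMB    = rnorm(n_months, 0.00, 0.02)
)
for (f in fund_cols) {
  # Arbitrary data-generating process, for illustration only.
  wide[[f]] <- 0.9 * wide$Mkt.RF + rnorm(n_months, 0, 0.01)
}

# Wide -> long: one row per (Date, Fund); the factors repeat for each fund.
long <- reshape(
  wide,
  direction = "long",
  varying   = fund_cols,
  v.names   = "Return",
  timevar   = "Fund",
  times     = fund_cols,
  idvar     = "Date"
)

# One time-series regression per fund.
fits <- lapply(split(long, long$Fund), function(d) {
  lm(Return ~ Mkt.RF + HML + SMB, data = d)
})

# Collect the coefficients: one row per fund, one column per term.
coefs <- t(sapply(fits, coef))
print(round(coefs, 4))
```

Each row of `coefs` holds the intercept and the three factor loadings of one fund, so n in every single regression is still the number of months.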

However, with the regression loop I have to average the regression coefficients, as I cannot display the results for a couple of thousand funds. This, obviously, leaves me with the same results as the regression on the average fund.
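
As far as I can tell, this equivalence is just the linearity of OLS, since every fund is regressed on the same factors. In my own notation, let $X$ be the matrix holding a constant and the three factors for each month, $y_i$ the return series of fund $i$, $N$ the number of funds, and $\bar y$ the average fund return:

```latex
\hat\beta_i = (X^\top X)^{-1} X^\top y_i
\quad\Longrightarrow\quad
\frac{1}{N}\sum_{i=1}^{N} \hat\beta_i
= (X^\top X)^{-1} X^\top \Bigl(\frac{1}{N}\sum_{i=1}^{N} y_i\Bigr)
= (X^\top X)^{-1} X^\top \bar y
```

This assumes all funds cover the same months. If the funds had different sample periods, $X$ would differ by fund and the two results would no longer coincide.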

The issue I have with my results is that n equals the number of months (which, yes, I could replace with weeks), which in my case is rather small. It also feels to me that it does not matter how many funds I include, as in the end they are only part of a portfolio. Including more funds makes the average fund more representative of the market, but I do not feel that single funds, or the number of funds included, make a big difference to the regression results.

I have thought about transforming the data frame so that n is the number of funds, using each fund's return over the whole period, but then the factors take the same value for every fund, so they end up as constants and are useless for the regression.
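
A minimal sketch of what goes wrong in that cross-sectional layout, again with simulated data and placeholder names (`cross`, `n_funds`, and the factor values are my own inventions for illustration):

```r
# Hypothetical cross-section: one row per fund, whole-period return.
# The factor values are placeholders; each is identical for every fund.
set.seed(1)
n_funds <- 50
cross <- data.frame(
  Return = rnorm(n_funds, 0.05, 0.02),  # simulated whole-period fund returns
  Mkt.RF = 0.10,
  HML    = 0.03,
  SMB    = 0.01
)

fit <- lm(Return ~ Mkt.RF + HML + SMB, data = cross)

# The constant factors are collinear with the intercept, so lm() cannot
# estimate them and reports NA for their coefficients.
print(coef(fit))
```

Only the intercept is estimated, and it is simply the mean return across funds, so the factors carry no information here.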

So, to get to my questions: Is a time-series regression like the one I am running the way to go? Is there a way to optimise the data frame so that the regression has more value? Is there another way to include every fund in the regression so that each one has its own impact on the results, other than the two ways I have described above?

Thank you for the read and any responses!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
