Saving pickle object after each iteration (slow execution time)

I have a program that runs a regression using functions from the statsmodels library (e.g. the GLM procedure). I know the program needs to do about 10,000 iterations, and that the data fed into each regression is fairly large.

I would like to store every result object from the regression in a DataFrame. However, I find that the program runs extremely slowly after about 4,000 iterations. My guess is that the DataFrame grows very large because each statsmodels result object keeps a reference to the underlying data, and that this is what slows everything down. Any ideas how I could do this more efficiently? Also, what is this kind of problem called?
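For reference, a minimal sketch of the pattern I mean (the random data here is just a stand-in for my real input). As far as I understand, statsmodels result objects keep references to the model and data arrays, and calling `remove_data()` on a result strips those arrays before storing it; note the documented caveat that lazily computed attributes should be accessed before the data is removed:

```python
import numpy as np
import statsmodels.api as sm

# Stand-in data; the real input is much larger.
rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(1000, 3)))
y = rng.normal(size=1000)

results = []
for i in range(10):  # ~10,000 in the real program
    res = sm.GLM(y, X).fit()
    res.aic  # access anything needed later, while the data is still attached
    # Drop the references to the data arrays so the stored
    # object holds only the fitted quantities.
    res.remove_data()
    results.append(res)
```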

Edit: I think it might be better to save only the necessary things, such as which variables were used and some standard goodness-of-fit measures. Once all 10,000 iterations have finished, I would identify the model I want based on those measures and re-fit the regression to get the result object, so I would not need to store it every iteration. A sketch of what I mean follows below.
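Something like this, assuming the candidate variable subsets are known in advance (the data, column names, and `candidates` list here are purely illustrative):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Stand-in data and candidate variable subsets.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(1000, 4)), columns=list("abcd"))
X["const"] = 1.0
y = rng.normal(size=1000)
candidates = [["const", "a"], ["const", "a", "b"], ["const", "c", "d"]]

records = []
for cols in candidates:  # ~10,000 candidates in the real program
    res = sm.GLM(y, X[cols]).fit()
    # Keep only lightweight scalars; the result object itself is discarded.
    records.append({"variables": tuple(cols),
                    "aic": res.aic,
                    "deviance": res.deviance})

summary = pd.DataFrame(records)
best = summary.loc[summary["aic"].idxmin()]
# Re-fit only the chosen model once the sweep is done.
best_res = sm.GLM(y, X[list(best["variables"])]).fit()
```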

Thanks in advance.


