How to structure my code in order to backtest a trading strategy with Pandas
My goal is to simulate the past growth of a stock portfolio based on historical stock prices. I wrote some code that works (at least I think so). However, I am pretty sure that its basic structure is not very clever and probably makes things more complicated than they need to be. Maybe someone can help me and tell me the best way to approach a problem like mine.
I started with a dataframe containing historical stock prices for a number (here: 2) of stocks:
import pandas as pd
import numpy as np
price_data = pd.DataFrame({'Stock_A': [5, 6, 10],
                           'Stock_B': [5, 7, 2]})
Then I defined a starting capital (here: 1000 €). Furthermore, I decided how much of my money I want to invest in Stock_A (here: 50%) and Stock_B (here: also 50%).
capital = 1000
weighting = {'Stock_A': 0.5, 'Stock_B': 0.5}
assets = list(weighting)  # asset names used by the functions below
Now I can calculate how many shares of Stock_A and Stock_B I can buy at the beginning:
quantities = {key: weighting[key] * capital / price_data[key].iloc[0] for key in weighting}
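The same initial allocation can also be written without a dictionary comprehension. The following is only a sketch: it assumes the weights are stored in a `pd.Series` (named `weights` here, which is not a name from the original code) so that pandas aligns them with the price columns by label.

```python
import pandas as pd

price_data = pd.DataFrame({'Stock_A': [5, 6, 10],
                           'Stock_B': [5, 7, 2]})
capital = 1000

# Assumption: target weights kept in a Series instead of a dict.
weights = pd.Series({'Stock_A': 0.5, 'Stock_B': 0.5})

# price_data.iloc[0] is the first row of prices as a Series indexed by
# column name, so the division is aligned label by label.
quantities = weights * capital / price_data.iloc[0]
print(quantities)
```

With the example prices, each stock costs 5 at the start, so 500 € per stock buys 100 shares of each.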
As time goes by, the weights of the portfolio components will of course change, as the prices of Stock A and B move in opposite directions. So at some point the portfolio will mainly consist of Stock A, while the proportion of Stock B (by value) gets pretty small. To correct for this, I want to restore the initial 50:50 weighting as soon as the portfolio weights deviate too much from it (so-called rebalancing). I defined a function to decide whether rebalancing is needed or not:
def need_to_rebalance(row):
    rebalance = False
    for asset in assets:
        # Rebalance as soon as one asset's share of the portfolio leaves the 40-60% band.
        if not 0.4 < row[asset] * quantities[asset] / portfolio_value < 0.6:
            rebalance = True
            break
    return rebalance
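To make the drift concrete, the value weights for the example prices can be computed for all rows at once. This sketch assumes the holdings stay fixed at the initial 100 shares each, so it is only valid up to the first rebalancing; the names `position_values`, `current_weights` and `out_of_band` are illustrative and not part of the original code.

```python
import pandas as pd

price_data = pd.DataFrame({'Stock_A': [5, 6, 10],
                           'Stock_B': [5, 7, 2]})

# Assumption: holdings are constant (no rebalancing has happened yet).
quantities = pd.Series({'Stock_A': 100.0, 'Stock_B': 100.0})

# Multiplying a DataFrame by a Series aligns the Series index with the columns.
position_values = price_data * quantities
portfolio_value = position_values.sum(axis=1)

# Share of each position in the total portfolio value, row by row.
current_weights = position_values.div(portfolio_value, axis=0)

# Same band as in need_to_rebalance: a weight must lie strictly between 0.4 and 0.6.
out_of_band = ((current_weights <= 0.4) | (current_weights >= 0.6)).any(axis=1)
print(current_weights)
print(out_of_band)
```

For the example data, only the last row leaves the band: Stock_A is then worth 1000 € out of 1200 €, about 83% of the portfolio.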
If we perform a rebalancing, the following function returns the updated number of shares for Stock A and Stock B:
def rebalance(row):
    for asset in assets:
        # Shares needed so that each asset is back at its target weight.
        quantities[asset] = weighting[asset] * portfolio_value / row[asset]
    return quantities
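Both helpers can also be written so that they receive everything they need as arguments instead of reading module-level variables. This is a sketch of one possible variant, not a drop-in replacement: the signatures, the `lower`/`upper` parameters and the use of `pd.Series` for prices, quantities and weights are assumptions.

```python
import pandas as pd

def need_to_rebalance(prices, quantities, lower=0.4, upper=0.6):
    """Return True if any asset's value share is outside the open band (lower, upper)."""
    values = prices * quantities
    shares = values / values.sum()
    return bool(((shares <= lower) | (shares >= upper)).any())

def rebalance(prices, portfolio_value, weights):
    """Return the share counts that restore the target weights at the given prices."""
    return weights * portfolio_value / prices

# Example: the last row of the price data, still holding 100 shares of each stock.
prices = pd.Series({'Stock_A': 10, 'Stock_B': 2})
quantities = pd.Series({'Stock_A': 100.0, 'Stock_B': 100.0})
weights = pd.Series({'Stock_A': 0.5, 'Stock_B': 0.5})

if need_to_rebalance(prices, quantities):
    quantities = rebalance(prices, (prices * quantities).sum(), weights)
print(quantities)
```

Because the functions no longer modify shared state, more complex rules (for example tax effects) could be added as extra parameters without touching the rest of the code.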
Finally, I defined a third function that I can apply to each row of the dataframe containing the stock prices in order to calculate the value of the portfolio based on the number of shares we currently own. It looks like this:
def run_backtest(row):
    global portfolio_value, quantities
    # Value of the portfolio at the current prices, before any rebalancing.
    portfolio_value = sum(np.array(row[assets]) * np.array(list(quantities.values())))
    if need_to_rebalance(row):
        quantities = rebalance(row)
    # Record the holdings after the (possible) rebalancing.
    for asset in assets:
        historical_quantities[asset].append(quantities[asset])
    return portfolio_value
Then I put it all to work using .apply:
historical_quantities = {}
for asset in assets:
    historical_quantities[asset] = []
output = price_data.copy()
output['portfolio_value'] = price_data.apply(run_backtest, axis=1)
output.join(pd.DataFrame(historical_quantities), rsuffix='_weight')
The result looks reasonable to me and it is basically what I wanted to achieve. However, I was wondering whether there is a more efficient way to solve the problem. Somehow, doing the calculation row by row and storing all the values in the variable 'historical_quantities' just to add them to the dataframe at the end doesn't look very clever to me. Furthermore, I have to use a lot of global variables. Storing a lot of values from inside the functions as global variables makes the code pretty messy (in particular if the calculations concerning rebalancing get more complex, for example when including tax effects). If you have read this far, I would appreciate any help.
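One possible way to avoid the globals is to keep the evolving state (the current holdings) as local variables of a single function and to collect one record per row. The sketch below is only an illustration under assumptions: the function name `backtest`, its parameters and the `_quantity` column suffix are made up for this example. Since each row depends on the holdings left by the previous one, an explicit loop is still used; only the bookkeeping changes.

```python
import pandas as pd

def backtest(prices, weights, capital, lower=0.4, upper=0.6):
    """Simulate a band-rebalanced portfolio; return prices, value and holdings per row."""
    # Align the target weights with the price columns by label.
    weights = pd.Series(weights).reindex(prices.columns)
    # Initial purchase at the prices of the first row.
    quantities = weights * capital / prices.iloc[0]

    records = []
    for _, row in prices.iterrows():
        values = row * quantities
        portfolio_value = values.sum()
        shares = values / portfolio_value
        # Rebalance when any value share is outside the open band (lower, upper).
        if ((shares <= lower) | (shares >= upper)).any():
            quantities = weights * portfolio_value / row
        # Store the value and the holdings after a possible rebalancing.
        records.append({'portfolio_value': portfolio_value,
                        **quantities.add_suffix('_quantity').to_dict()})

    return prices.join(pd.DataFrame(records, index=prices.index))

price_data = pd.DataFrame({'Stock_A': [5, 6, 10],
                           'Stock_B': [5, 7, 2]})
result = backtest(price_data, {'Stock_A': 0.5, 'Stock_B': 0.5}, capital=1000)
print(result)
```

For the example data, this yields portfolio values of 1000, 1300 and 1200, with a rebalancing to 60 shares of Stock_A and 300 shares of Stock_B in the last row.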
All the best
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
