Python: Speeding Up If-Else Function When Iterating Over Pandas DataFrame
I have a very large dataframe (585k rows) where I need to manipulate the data in one of the columns, conditional on the values in other columns.
For example, the relevant columns are:
'time' - which occasionally resets back to zero
'order_executed' - Bool value or NaN
'order_id' - id
'cum_price' - cumulative total price for the period
'single_order_id' - id associated with the order of a single product
'single_order_executed' - Bool value or NaN, associated with single_order_id
plus one column per order_id, which holds the cumulative cost for that id over time.
My problem: the 'total_price' column does not update when 'single_order_executed' is True, although the single order is taken into account when 'total_price' is updated on the next larger order. For example, if order_id_A places an order that costs $5, 'cum_price' is updated to 5. However, if order_id_A then places a 'single_order', 'cum_price' is not updated to 6. When they place their next large order, for $5, 'cum_price' is updated to 11.
My goal: for each unique id in the dataset, generate an accurate 'total_price'. That is, I am creating a new column that tracks the total price at any given time, so that I can track the cost associated with each customer for each period.
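To make the example concrete, here is a minimal sketch of the situation as a toy frame. The four rows and their values are hypothetical, made up to mirror the $5 / single order / $5 example; I am assuming the recorded total lives in the 'total_price' column that the code reads.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for one customer "A" (not taken from the real dataset).
orders = pd.DataFrame({
    "time": [0, 1, 2, 3],
    "order_executed": [np.nan, True, np.nan, True],
    "order_id": [np.nan, "A", np.nan, "A"],
    "total_price": [0, 5, 5, 11],  # recorded total: stale at time 2
    "single_order_executed": [np.nan, np.nan, True, np.nan],
    "single_order_id": [np.nan, np.nan, "A", np.nan],
})

# The running total I want for "A": the single order at time 2 adds 1.
expected_A = pd.Series([0, 5, 6, 11])

# Rows where the recorded total lags behind the desired total.
stale = orders["total_price"].ne(expected_A)
```

Only the row with the single order differs, which is the gap the new column is meant to close.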
Currently, I have the following code:
def price(order_id):
    col = str(order_id)
    cost = 0  # running total; initialised once so it carries across rows
    for i in range(len(orders)):
        if orders['time'][i] == 0:
            cost = 0  # time reset: the running total starts over
        # Compare with == True explicitly, because NaN is truthy in Python.
        elif (orders['order_executed'][i] == True) and (orders['order_id'][i] == order_id):
            cost = orders['total_price'][i]
        elif (orders['single_order_executed'][i] == True) and (orders['single_order_id'][i] == order_id):
            cost += 1
        orders.loc[i, col] = cost  # .loc avoids chained assignment
This returns the expected result when I feed in the unique order_ids, but it takes far too long to run (hours in some cases).
Can anyone help provide a faster solution?
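One direction that might work is a vectorised version. The sketch below is not a tested solution for the real data: it assumes the rows are already in time order, that a large order overwrites the running total with 'total_price', and that each single order adds 1, as in the loop above. The function name `running_cost` is made up for illustration.

```python
import numpy as np
import pandas as pd

def running_cost(orders, order_id):
    """Vectorised running cost for one order_id (assumes rows are in time order)."""
    reset = orders["time"].eq(0)
    big = orders["order_executed"].eq(True) & orders["order_id"].eq(order_id) & ~reset
    single = (orders["single_order_executed"].eq(True)
              & orders["single_order_id"].eq(order_id) & ~reset & ~big)
    # Base value: total_price at each large order, 0 at each reset, carried forward.
    base = orders["total_price"].where(big).mask(reset, 0).ffill().fillna(0)
    # Count single orders since the most recent reset or large order.
    segment = (reset | big).cumsum()
    extra = single.astype(int).groupby(segment).cumsum()
    return base + extra

# Hypothetical toy data: a large order, a single order, a large order, then a reset.
orders = pd.DataFrame({
    "time": [0, 1, 2, 3, 0, 1],
    "order_executed": [np.nan, True, np.nan, True, np.nan, np.nan],
    "order_id": [np.nan, "A", np.nan, "A", np.nan, np.nan],
    "total_price": [0, 5, 5, 11, 0, 0],
    "single_order_executed": [np.nan, np.nan, True, np.nan, np.nan, True],
    "single_order_id": [np.nan, np.nan, "A", np.nan, np.nan, "A"],
})
orders["A"] = running_cost(orders, "A")
```

Because everything is expressed as whole-column operations, there is no Python-level loop over the 585k rows; only the loop over unique ids remains.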
I've tried using a few different methods (iterating over a dict, itertuples, etc.) but I can't seem to get the syntax right.
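For reference, this is roughly the itertuples syntax I was aiming for, as a sketch rather than a verified fix. It assumes the column names are valid Python identifiers (so they work as attributes on each row) and collects the results in a list, assigning the column once at the end instead of writing cell by cell. The function name `running_cost_itertuples` and the toy rows are hypothetical.

```python
import numpy as np
import pandas as pd

def running_cost_itertuples(orders, order_id):
    """Single pass over the rows; returns the running cost as a list."""
    costs = []
    cost = 0
    for row in orders.itertuples(index=False):
        if row.time == 0:
            cost = 0  # time reset: the running total starts over
        # Compare with == True explicitly, because NaN is truthy in Python.
        elif row.order_executed == True and row.order_id == order_id:
            cost = row.total_price
        elif row.single_order_executed == True and row.single_order_id == order_id:
            cost += 1
        costs.append(cost)
    return costs

# Hypothetical toy data mirroring the $5 / single order / $5 example.
orders = pd.DataFrame({
    "time": [0, 1, 2, 3],
    "order_executed": [np.nan, True, np.nan, True],
    "order_id": [np.nan, "A", np.nan, "A"],
    "total_price": [0, 5, 5, 11],
    "single_order_executed": [np.nan, np.nan, True, np.nan],
    "single_order_id": [np.nan, np.nan, "A", np.nan],
})
orders["A"] = running_cost_itertuples(orders, "A")
```

Building a list and assigning it once avoids the repeated per-cell writes into the DataFrame, which are likely the slow part of the original loop.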
Thanks!
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
