Vectorizing a sequential operation that has a conditional in Pandas

I have a Pandas dataframe with 3 columns: a column of booleans, A; a column of values, B; and a column that I want to fill, C. I also have an initial value for C.

A         B         C
----------------------
True     10        100
False    20        NaN
True     25        NaN
True     28        NaN
...

I want the values of column C (for C[1:]) to follow this rule.

if A[i - 1]:
    C[i] = C[i - 1] * B[i] / B[i - 1]
else:
    C[i] = C[i - 1]

Of course this formula cannot determine C[0], but C[0] is provided.
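
For reference, the rule can be written as a plain loop. This is a minimal sketch, assuming the columns are named A, B and C as in the table above; the function name fill_c_loop is only illustrative.

```python
import numpy as np
import pandas as pd

# The four sample rows from the table above.
df = pd.DataFrame({
    "A": [True, False, True, True],
    "B": [10.0, 20.0, 25.0, 28.0],
    "C": [100.0, np.nan, np.nan, np.nan],
})

def fill_c_loop(frame):
    """Apply the rule row by row, starting from the given C[0]."""
    a = frame["A"].to_numpy(dtype=bool)
    b = frame["B"].to_numpy(dtype=float)
    c = frame["C"].to_numpy(dtype=float).copy()
    for i in range(1, len(c)):
        if a[i - 1]:
            c[i] = c[i - 1] * b[i] / b[i - 1]
        else:
            c[i] = c[i - 1]
    return c

df["C"] = fill_c_loop(df)
print(df["C"].tolist())  # approximately [100.0, 200.0, 200.0, 224.0]
```

Working on NumPy arrays rather than on df.loc inside the loop avoids most of the per-row pandas overhead.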

Is there a way to do this efficiently using vectorized operations?

What I've tried:

The following command doesn't account for the sequential nature of the operation.

df.loc[df.A , 'C'] = df.C.shift(1) * df.B / df.B.shift(1)
df.loc[df.A == 0, 'C'] = df.C.shift(1)
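
To make the problem concrete, here is a small sketch that runs those two lines on the four sample rows (column names as in the table above). df.C.shift(1) is evaluated once, on a column that is still almost entirely NaN, so nothing propagates down the rows. The masks also test A[i], whereas the rule depends on A[i - 1], so row 0 matches the first mask and the provided C[0] gets overwritten.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [True, False, True, True],
    "B": [10.0, 20.0, 25.0, 28.0],
    "C": [100.0, np.nan, np.nan, np.nan],
})

# The right-hand side is computed once from the current C:
# shift(1) of [100, NaN, NaN, NaN] is [NaN, 100, NaN, NaN].
df.loc[df.A, "C"] = df.C.shift(1) * df.B / df.B.shift(1)
df.loc[df.A == 0, "C"] = df.C.shift(1)

print(df["C"])  # no row ends up with a usable value
```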

If I were to use an apply function to compute this, I would probably have to make new shifted columns like the following, and then run the apply only for rows [1:]. But how do I get the updated previous value of C?

df["s_A"] = df.A.shift(1)
df["s_B"] = df.B.shift(1)
df["s_C"] = df.C.shift(1)
df.loc[df.index[0], "s_A"] = False  # this assumption is okay for my purposes

Should this work and is there a faster way? There may be up to 400,000 rows in total across multiple dataframes but it is not super time sensitive.

For clarity's sake I will mention that there are around 12 columns total, but only these three are relevant to this operation.

Is it possible to vectorize this operation? Is there another way it can be solved?
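
One possible direction, offered as a sketch rather than a definitive answer: each step multiplies C[i - 1] by B[i] / B[i - 1] when A[i - 1] is true and by 1 otherwise, so C[i] equals C[0] times a running product of those factors. A running product is what cumprod computes, which removes the Python-level loop. The names ratio, prev_a and factor are illustrative, and the sketch assumes B contains no zeros or NaNs.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [True, False, True, True],
    "B": [10.0, 20.0, 25.0, 28.0],
    "C": [100.0, np.nan, np.nan, np.nan],
})

# Step ratio B[i] / B[i - 1]; the first entry is NaN because row 0 has no predecessor.
ratio = df["B"] / df["B"].shift(1)

# A[i - 1], with row 0 treated as False so that its factor becomes 1.
prev_a = df["A"].shift(1, fill_value=False).astype(bool)

# Keep the ratio where the previous A is True, otherwise multiply by 1.
factor = ratio.where(prev_a, 1.0)

# C[i] = C[0] * product of the factors up to and including row i.
df["C"] = df["C"].iloc[0] * factor.cumprod()
print(df["C"].tolist())  # approximately [100.0, 200.0, 200.0, 224.0]
```

On the sample rows this agrees with the row-by-row rule. For very long series the running product can accumulate floating-point error or overflow, so it may be worth comparing against a plain loop on real data.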

Thanks.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
