How can I replace consecutive duplicates in a dataframe column in Python?
Say my column looks something like this:
trade_signal
buy
buy
buy
buy
sell
sell
sell
sell
buy
buy
buy
sell
sell
buy
sell
buy
I would like to replace the consecutive duplicate elements in the column with NaN or 0, keeping only the first value of each run, so it would end up with something like:
trade_signal
buy
nan
nan
nan
sell
nan
nan
nan
buy
nan
nan
sell
nan
buy
sell
buy
I am completely unsure of the logic I can use to do this. I think I would somehow fill with NaN values up until the next change in signal?
Solution 1:[1]
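Try mask with shift. Comparing each value with the previous row flags the repeats, and mask replaces the flagged values with NaN:
df['trade_signal'] = df['trade_signal'].mask(
    df['trade_signal'].eq(df['trade_signal'].shift())
)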
trade_signal
0 buy
1 NaN
2 NaN
3 NaN
4 sell
5 NaN
6 NaN
7 NaN
8 buy
9 NaN
10 NaN
11 sell
12 NaN
13 buy
14 sell
15 buy
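For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the column from the question and applies the same mask/shift idea. It also covers the "or 0" part of the question by passing a replacement value to mask. The column names `signal_nan` and `signal_zero` are made up for illustration, and the cast to object is an assumption added so that mixing strings and 0 behaves the same across pandas versions.

```python
import pandas as pd

# Rebuild the example column from the question (16 rows).
signals = (
    ["buy"] * 4 + ["sell"] * 4 + ["buy"] * 3 + ["sell"] * 2
    + ["buy", "sell", "buy"]
)
df = pd.DataFrame({"trade_signal": signals})

# True where a row repeats the value directly above it.
# The first row is compared with NaN, so it is never flagged.
is_repeat = df["trade_signal"].eq(df["trade_signal"].shift())

# mask() replaces flagged values with NaN by default.
# "signal_nan" is a hypothetical column name.
df["signal_nan"] = df["trade_signal"].mask(is_repeat)

# Passing a second argument replaces flagged values with 0 instead.
# Casting to object first (an assumption, for safety) allows a column
# that mixes strings and integers. "signal_zero" is hypothetical too.
df["signal_zero"] = df["trade_signal"].astype(object).mask(is_repeat, 0)

print(df)
```

Because the comparison is only against the previous row, a value that reappears later (such as the second run of buy) is kept, which matches the desired output in the question.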
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | anky |
