How to ignore the successive same values of a column in a pandas dataframe?

I have this pandas dataframe:

[image: df]

I want to create a new column "entry_price" that, for each day, takes the first "buy" value in the "entry" column and writes the associated "open" value in that row.

This is an example of the dataframe I want to have (but maybe there's a better way): [image: desired df]

So, as you can see, I need to consider only the first "buy" of the day.

I tried this method, with no success:

df['entry_price'] = df['open'].where(df['entry'] == "buy")

This method does not ignore the successive "buy" values: it does not set the later "buy" rows of the same day to NaN. Any ideas?



Solution 1:[1]

You should filter your dataframe to the rows where entry == 'buy', create a new column that holds only the day part of the date, and then use groupby on that column to take the minimum (earliest) date for each day:

import pandas as pd

data = {"date": ["2022-02-28 06:00:00", "2022-02-28 06:00:05", "2022-03-01 06:59:35", "2022-03-01 06:59:40"],"entry": ["no", "buy", "buy", "buy"], "open": [1.12, 1.13, 1.135, 1.132]}

df = pd.DataFrame(data)
# keep only the day part (YYYY-MM-DD) of each timestamp string
df["day"] = df["date"].apply(lambda elem: elem.split(" ")[0])
# identify the earliest "buy" timestamp of each day
dates = df[df['entry'] == 'buy'].groupby("day")["date"].min()
# keep only the rows with those timestamps
df[df["date"].isin(dates.values)]

                  date entry   open         day
1  2022-02-28 06:00:05   buy  1.130  2022-02-28
2  2022-03-01 06:59:35   buy  1.135  2022-03-01
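
The filter above returns only the first "buy" rows. If you want the "entry_price" column from the question instead, with NaN everywhere except the first "buy" of each day, one possible sketch is below. It is not part of the original answer. It assumes the rows are already sorted by date, and the names is_buy, day, buy_rank and first_buy are just illustrative:

```python
import pandas as pd

data = {
    "date": ["2022-02-28 06:00:00", "2022-02-28 06:00:05",
             "2022-03-01 06:59:35", "2022-03-01 06:59:40"],
    "entry": ["no", "buy", "buy", "buy"],
    "open": [1.12, 1.13, 1.135, 1.132],
}
df = pd.DataFrame(data)
df["date"] = pd.to_datetime(df["date"])

# True on every "buy" row
is_buy = df["entry"] == "buy"
# calendar day of each row, used as the grouping key
day = df["date"].dt.normalize()
# running count of "buy" rows within each day (assumes rows sorted by date)
buy_rank = is_buy.astype(int).groupby(day).cumsum()
# a row is the first "buy" of its day when it is a "buy" and the count is 1
first_buy = is_buy & (buy_rank == 1)

# keep "open" only on those rows, NaN elsewhere
df["entry_price"] = df["open"].where(first_buy)
print(df)
```

This keeps the same where() idea from the question and only changes the condition, so the later "buy" rows of the same day end up as NaN.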

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 pac