'How to reassign values in column by condition in dataframe?
df = pd.DataFrame([["A", "AA", "AAA", "found"],
["A", "AB", "ABA", "not found"],
["A", "AB", "ABB", "not found"],
["B", "BA", "BAA", "not found"],
["B", "BB", "BBA", "not found"],
["C", "CA", "CAA", "not found"],
["C", "CB", "CBA", "found"],
["C", "CB", "CBA", "not found"]], columns=["col_a", "col_b", "col_c", "where"])
col_a col_b col_c where
0 A AA AAA found
1 A AB ABA not found
2 A AB ABB not found
3 B BA BAA not found
4 B BB BBA not found
5 C CA CAA not found
6 C CB CBA found
7 C CB CBA not found
8 C CB CBB not found
I need to reassign values based on col_a and col_b on condition: if in any row with unique combination in col_a and col_b is "found", assign it to every row in that combination independent on other columns (e.g. col_c)
I tried to cycle through it, it kinda works but is inelegant and inefficient.
unique_col_a = df["col_a"].unique()
for bus in unique_col_a:
unique_pdus = df[df["col_a"] == bus]["col_b"].unique()
for pdu in unique_pdus:
bus_pdus_found = df[(df["col_a"] == bus) & (df["col_b"] == pdu) & (df["where"] == "found")]
if len(bus_pdus_found) > 0:
df.loc[(df["col_a"] == bus) & (df["col_b"] == pdu) & (df["where"] != "found"), "where"] = "found"
So expected result would be something like:
col_a col_b col_c where
0 A AA AAA found
1 A AB ABA not found
2 A AB ABB not found
3 B BA BAA not found
4 B BB BBA not found
5 C CA CAA not found
6 C CB CBA found
7 C CB CBA found
8 C CB CBB found
Thanks for your help! Appreciate it very much
Solution 1:[1]
Assuming column where
only contains the values found, not found
, you can group the dataframe by col_a, col_b
and transform where
with min
. This approach works because found
is always less than not found
alphabetically:
df['where'] = df.groupby(['col_a', 'col_b'])['where'].transform('min')
If the column where
contains other values as well, here is a more general solution:
m = df['where'].eq('found').groupby([df['col_a'], df['col_b']]).transform('any')
df.loc[m, 'where'] = 'found'
Result
col_a col_b col_c where
0 A AA AAA found
1 A AB ABA not found
2 A AB ABB not found
3 B BA BAA not found
4 B BB BBA not found
5 C CA CAA not found
6 C CB CBA found
7 C CB CBA found
8 C CB CBB found
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |