How to update DataFrame values from another DataFrame based on a condition
I'm trying to update the "qty" column of one DataFrame from the "qty" column of another DataFrame, but only for specific rows (those with specific types).
Here are my example DataFrames:
import pandas as pd

df = pd.DataFrame({'op': ['A', 'A', 'A', 'B', 'C'], 'type': ['X', 'Y', 'Z', 'X', 'Z'], 'qty': [3, 1, 8, 0, 4]})
df_xy = pd.DataFrame({'op': ['A', 'B', 'C'], 'qty': [10, 20, 30]})
print(df)
print(df_xy)
  op type  qty
0  A    X    3
1  A    Y    1
2  A    Z    8
3  B    X    0
4  C    Z    4
  op  qty
0  A   10
1  B   20
2  C   30
I tried to use loc to select the relevant rows and look up the matching value in the other DataFrame through my reference column "op", but without success:
# Select df rows where "type" is in "types" and set "qty" according to "qty" from df_xy
types = ['X', 'Y']
df.loc[df['type'].isin(types), 'qty'] = df_xy.loc[df_xy['op'] == df['op'], 'qty']
print(df)
I would like to end up with a DataFrame like this:
  op type  qty
0  A    X   10
1  A    Y   10
2  A    Z    8
3  B    X   20
4  C    Z    4
But I get an error saying that I cannot compare Series objects that are not labeled the same way:
ValueError: Can only compare identically-labeled Series objects
Any help is much appreciated! Thanks in advance!
Solution 1:[1]
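You could combine loc and merge to align your two Series. Note that this relies on df having the default 0..n-1 index and on every "op" in df being present in df_xy, so that the merged rows line up with the rows of df: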
df.loc[df['type'].isin(types), 'qty'] = df[['op']].merge(df_xy, on='op')['qty']
output:
  op type  qty
0  A    X   10
1  A    Y   10
2  A    Z    8
3  B    X   20
4  C    Z    4
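If you cannot guarantee that every "op" has a match, or that df uses the default index, a slightly more defensive variant of the same idea might look like the sketch below. It assumes "op" is unique in df_xy (otherwise the merge would produce extra rows); the names looked_up and mask are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'op': ['A', 'A', 'A', 'B', 'C'],
                   'type': ['X', 'Y', 'Z', 'X', 'Z'],
                   'qty': [3, 1, 8, 0, 4]})
df_xy = pd.DataFrame({'op': ['A', 'B', 'C'], 'qty': [10, 20, 30]})
types = ['X', 'Y']

# how='left' keeps one output row per row of df, in df's order
# (assumption: 'op' is unique in df_xy, so no rows are duplicated)
looked_up = df[['op']].merge(df_xy, on='op', how='left')['qty']

# The merge result has a fresh 0..n-1 index; re-attach df's labels
# so that the assignment below aligns row by row.
looked_up.index = df.index

mask = df['type'].isin(types)
df.loc[mask, 'qty'] = looked_up[mask]
print(df)
```

With how='left', an "op" that is missing from df_xy would give NaN for that row instead of silently shifting the following rows.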
Solution 2:[2]
Use Series.map on the filtered rows only, on both sides of the assignment, to avoid processing rows that do not match (here, the Z rows):
types = ['X', 'Y']
mask = df['type'].isin(types)
df.loc[mask, 'qty'] = df.loc[mask, 'op'].map(df_xy.set_index('op')['qty'])
print(df)
  op type  qty
0  A    X   10
1  A    Y   10
2  A    Z    8
3  B    X   20
4  C    Z    4
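One thing to watch with map: an "op" that has no entry in df_xy maps to NaN. If you would rather keep the existing "qty" in that case, a possible sketch is shown below. The sample data (including the unmatched op 'D') and the names lookup and new_qty are made up for illustration.

```python
import pandas as pd

# Hypothetical data: op 'D' has no counterpart in df_xy
df = pd.DataFrame({'op': ['A', 'A', 'D'],
                   'type': ['X', 'Z', 'Y'],
                   'qty': [3, 8, 5]})
df_xy = pd.DataFrame({'op': ['A', 'B'], 'qty': [10, 20]})

mask = df['type'].isin(['X', 'Y'])
lookup = df_xy.set_index('op')['qty']   # Series indexed by 'op'

# map returns NaN where 'op' is not found in lookup
new_qty = df.loc[mask, 'op'].map(lookup)

# Fall back to the current qty for unmatched rows, then restore the integer dtype
new_qty = new_qty.fillna(df.loc[mask, 'qty']).astype(df['qty'].dtype)

df.loc[mask, 'qty'] = new_qty
print(df)
```

Here row 0 is updated from df_xy, row 1 is left alone because its type is Z, and row 2 keeps its original value because 'D' is not in df_xy.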
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mozway |
| Solution 2 | jezrael |
