'How to update pandas's column if they have the same columns's value?
Let's say, I have two original DataFrames like this:
df1 = pd.DataFrame({"ID": [101, 102, 103], "Price":[12, 33, 44], "something":[12,22,11]})
df2 = pd.DataFrame({"ID": [101, 103], "Price":[122, 133]})
And it displays like this:
ID Price something
0 101 12 12
1 102 33 22
2 103 44 11
And
ID Price
0 101 122
1 103 133
Since I don't set any indices for any column, I want to know how can I update the df1 if both DataFrames have the same ID. For this sample, I hope I can get the result like this:
ID Price something
0 101 122 12
1 102 33 22
2 103 133 11
You can see, I only care about the price column. Here is what I have tried for now:
pd.concat([df1,df2]).drop_duplicates(['ID'],keep='last')
But it just shows me:
ID Price something
1 102 33 22.0
0 101 122 NaN
1 103 133 NaN
I don't want any other column values to be changed.
I'd like to keep the order of the rows of df1.
UPDATE
After running the answer code, and I found the order of the columns will change, since we are using reset_index, something about index. so I hope someone can point me out how to keep the original position of my DataFrame. For now, it looks like below:
In [180]: df1 = pd.DataFrame({"ss":[12,22,11], "ID": [101, 102, 103], "Price":[12, 33, 44], "something":[12,22,11]})
...: df2 = pd.DataFrame({"ID": [101, 103], "Price":[122, 133]})
In [181]: df1.set_index('ID',inplace=True)
...: df1.update(df2.set_index('ID'))
...: df1.reset_index(inplace=True)
In [182]: df1
Out[182]:
ID ss Price something
0 101 12 122.0 12
1 102 22 33.0 22
2 103 11 133.0 11
Solution 1:[1]
Using np.where and isin update your price in df1 after merge
df1.Price=np.where(df1.ID.isin(df2.ID),df1.merge(df2,on='ID',how='left')['Price_y'],df1.Price)
df1
ID Price something
0 101 122.0 12
1 102 33.0 22
2 103 133.0 11
Using update:
df1.set_index('ID',inplace=True)
df1.update(df2.set_index('ID'))
df1.reset_index(inplace=True)
df1
ID Price something
0 101 122.0 12
1 102 33.0 22
2 103 133.0 11
Solution 2:[2]
Another possible solution could be using combine_first()
df2.set_index(['ID']).combine_first(df1.set_index(['ID', 'something'])).reset_index()
And also by using isin()
df1.loc[df1.ID.isin(df2.ID), ['Price']] = df2[['Price']].values
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | JejeBelfort |
| Solution 2 |
