'Drop all rows id pandas df except ones mentioned in another df
I have two dataframes, one containing lot of columns
df1:
id age topic date text
1 23 Student 1.1. Lorem
2 19 Student 1.2. Cupcake
20 19 Student 1.2. Lorem Ipsum
190 21 Student 11.1. Cupcake Ipsum
And one with two columns
df2:
id count
1 105
20 4843
31 361
What I'm trying to accomplish here is to drop all the rows from df1 that are not mentioned in df2 (their id is not there).
result_df:
id age topic date text
1 23 Student 1.1. Lorem
20 19 Student 1.2. Lorem Ipsum
I've tried this but it's not working:
result_df = df1.drop(df1[df1['id'] != df2['id']].index)
Could you please help me out?
Solution 1:[1]
result_df = df1.merge(df2['id'])
Given:
df1:
id age topic
0 1 23 Student
1 2 19 Student
2 20 19 Student
3 190 21 Student
df2:
id count
0 1 105
1 20 4843
2 31 361
Doing:
result_df = df1.merge(df2['id'])
print(result_df)
Output:
id age topic
0 1 23 Student
1 20 19 Student
Solution 2:[2]
You can do:
df1 = df1.set_index('id')
df2 = df2.set_index('id')
result_df = df1.loc[df1.index.intersection(df2.index)].reset_index()
Output: result_df
id age topic date text
0 1 23 Student 1.1. Lorem
1 20 19 Student 1.2. Lorem Ipsum
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | SomeDude |
