'Filter out rows that are the same in one column but have multiple values in another columns respectively in dataframe
To say I have a dataframe with three columns like:
| index | A | B | C |
|---|---|---|---|
| 1 | foo | One | 1 |
| 2 | foo | Two | 2 |
| 3 | foo | Three | 3 |
| 4 | bar | One | 2 |
| 5 | bar | One | 1 |
| 6 | num | Two | 3 |
| 7 | num | Three | 3 |
In this case, how may I filter out the rows that have the same value in column B but more than one respective value in column C by using Python Pandas?
The rows that I need is 1, 2, 4, 5, 6 because "One" in column B has two corresponding values (1 and 2) in column C and "Two" in column B has two corresponding values as well. Eventually I want to group them by column A if possible.
Solution 1:[1]
You can try groupby B column then filter by the value_counts of C column.
out = df.groupby('B').filter(lambda group: len(group['C'].value_counts()) > 1)
print(out)
index A B C
0 1 foo One 1
1 2 foo Two 2
3 4 bar One 2
4 5 bar One 1
5 6 num Two 3
Solution 2:[2]
Not an optimised solution but will get your work done:
import pandas as pd
# create dataframe
df = pd.DataFrame([['foo','One',1],['foo','Two',2],['foo','Three',3],['bar','One',2], ['bar','One',1],['num','Two',3],['num','Three',3]], index = range(1,8), columns = ['A','B','C'])
# get the unique values present in column B
values = list(df['B'].unique())
result = pd.DataFrame()
# iterate through the unique values and for each unique value check the corresponding values in C
for val in values:
unique_values = list(df[df['B'] == val]['C'].unique())
# if the unique values in column C is more than 1, it satisfies your condition and hence can be added into your result dataFrame.
if len(unique_values) > 1:
result = result.append(df[df['B'] == val])
print(result)
The result is the rows 1, 2, 4, 5, 6.
Always show your work in the question.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ynjxsjmh |
| Solution 2 | Echo |
