Comparing every row with all other rows with pandas

My goal is to compare every row with all other rows to see how many rows are unique with respect to their entries. I am quite new to pandas, so I am at a loss. An example dataframe would be as follows:

import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3],
                   "age": [46, 48, 55],
                   "gender": ["female", "female", "male"]},
                  index=[0, 1, 2])

python pandas

Solution 1:^[1]

What do you want to obtain exactly?

If you want to know per column how many unique values you have, use nunique:

df.nunique()

ID        3
age       3
gender    2
dtype: int64
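To see which values lie behind those per-column counts, one option is `value_counts` on a single column. This is only a sketch on the question's example frame; the variable names `per_column` and `gender_counts` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3],
                   "age": [46, 48, 55],
                   "gender": ["female", "female", "male"]})

# Number of distinct values in each column (same call as above).
per_column = df.nunique()

# How often each distinct value occurs in one column.
gender_counts = df["gender"].value_counts()

print(per_column.to_dict())
print(gender_counts.to_dict())
```

Here `gender` has two distinct values because `'female'` appears twice and `'male'` once.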

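If you want to know how many unique rows there are (considering combinations of columns), use duplicated. The ID column is left out here because it differs for every row, so including it would make every row unique: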

len(df) - df[['age', 'gender']].duplicated().sum()

# or 
(~df.drop(columns='ID').duplicated()).sum()

# or
(~df[['age', 'gender']].duplicated()).sum()

3
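In the example frame every age/gender combination is different, so the result equals the number of rows. The sketch below uses a hypothetical variant, `df2`, with one repeated combination to show how "distinct combinations" differs from "rows that match no other row". It assumes that ID is only an identifier and should be ignored; `cols`, `has_twin`, `n_singletons` and `n_distinct` are illustrative names.

```python
import pandas as pd

# Hypothetical variant of the question's frame: row 3 repeats row 0's age and gender.
df2 = pd.DataFrame({"ID": [1, 2, 3, 4],
                    "age": [46, 48, 55, 46],
                    "gender": ["female", "female", "male", "female"]})

cols = ["age", "gender"]  # assumption: ID is only an identifier and is ignored

# keep=False marks every member of a duplicate group, not only the later repeats.
has_twin = df2.duplicated(subset=cols, keep=False)

# Rows whose age/gender combination occurs exactly once.
n_singletons = int((~has_twin).sum())

# Number of distinct age/gender combinations (what the expressions above count).
n_distinct = len(df2.drop_duplicates(subset=cols))

print(has_twin.tolist())
print(n_singletons, n_distinct)
```

With this data, rows 0 and 3 are flagged as matching each other, so two rows have no match, while there are three distinct combinations in total.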

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	mozway
