'For each value in a column, how to count the number of unique values in its row?

Suppose I have a data frame:

ID       person_1     person_2
ID_001   Aaron        Ben
ID_003   Kate         Ben
ID_001   Aaron        Lou
ID_005   Lee          Ben
ID_006   Aaron        Cassie
ID_001   Tim          Ben
ID_003   Ben          Mal

For every ID in the column "ID", I want to count the number of unique names that were associated with the ID

My desired output:

ID       Count
ID_001   4
ID_003   3
ID_005   2
ID_006   2

Code for reproduction:

df = pd.DataFrame({
'ID': ["ID_001", "ID_003", "ID_001", "ID_005", "ID_006", "ID_001", "ID_003"],
'person1': ["Aaron","Kate","Aaron","Lee","Aaron","Tim","Ben"],
'person2': ["Ben","Ben","Lou","Ben","Cassie","Ben","Mal"]
})

Solution 1:^[1]

Flat your columns person1 and person2 then remove duplicated names and finally count unique value per ID:

out = df.melt('ID').drop_duplicates(['ID', 'value']) \
        .value_counts('ID').rename('Count').reset_index()
print(out)

# Output
       ID  Count
0  ID_001      4
1  ID_003      3
2  ID_005      2
3  ID_006      2

Solution 2:^[2]

Melt your dataframe together, drop duplicates, then group by ID and aggregate over the count of the variables. At least, rename the column variable to Count.

df.melt(["ID"]).drop_duplicates(["ID","value"]).groupby(
    ["ID"]).agg({"variable":"count"}).reset_index().rename(
    columns={"variable":"Count"})

   ID          Count
   ID_001      4
   ID_003      3
   ID_005      2
   ID_006      2

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Corralien
Solution 2

'For each value in a column, how to count the number of unique values in its row?

Solution 1:[1]

Solution 2:[2]

Sources

Related Questions

Solution 1:^[1]

Solution 2:^[2]