Python code to display the number of rows that match

S.No FirstName LastName Department Email Matches Columnmatch   
1.   Prashant  Arora    Deli       abc   3       2
2.   Prashant  Arora    Dairy      abc   3       1 
3.   Prashant  Nan      Grocery    abc   2       2 
4.   Ash       Rana     Grocery    pqr   2       5
5.   Ash       Nan      Deli       pqr   2       4 

Here Matches is the column that shows the maximum number of columns in which the current row matches any other row, e.g. rows 1 and 2 have 3 columns that match exactly, and rows 3 and 2 have 2 columns that match. Columnmatch is the column that gives the number of the row that has the maximum number of matches with the current row. I only have a DataFrame with S.No, FirstName, LastName, Department and Email. I have to compute the last 2 columns.

Please help me with this part.

Thanks Prashant Arora



Solution 1:[1]

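Here is my pythonic way of solving it. The columnmatch column holds the DataFrame index label of the best-matching row (0-based with the default index) rather than S.No, but you can change this yourself if you desire something else. When several rows tie for the maximum, the first one is reported.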

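import pandas as pd

df = pd.DataFrame({
    'FirstName' : ['Prashant', 'Prashant', 'Prashant', 'Ash', 'Ash'],
    'LastName'  : ['Arora', 'Arora', 'Nan', 'Rana', 'Nan'],
    'Department': ['Deli', 'Dairy', 'Grocery', 'Grocery', 'Deli'],
    'Email'     : ['abc', 'abc', 'abc', 'pqr', 'pqr']
    })

# Compare only the original data columns, so the result columns
# added inside the loop never take part in the comparison.
cols = ['FirstName', 'LastName', 'Department', 'Email']

for i in range(len(df)):
    # For every other row, count the columns whose value equals that of row i.
    matching = (df[cols].drop(i) == df.loc[i, cols]).sum(axis=1)

    # Highest number of matching columns, and the index label of the row that has it.
    df.loc[i, 'matches'] = matching.max()
    df.loc[i, 'columnmatch'] = int(matching.idxmax())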
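
If you prefer to avoid the Python loop and want columnmatch to hold the 1-based row number (like S.No in the question), the same pairwise comparison can be sketched with NumPy broadcasting. This is only a sketch, not part of the original answer: the helper name add_match_columns is made up for illustration, ties go to the first matching row, and the intermediate array grows with the square of the number of rows, so it is only suitable for reasonably small frames.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'FirstName':  ['Prashant', 'Prashant', 'Prashant', 'Ash', 'Ash'],
    'LastName':   ['Arora', 'Arora', 'Nan', 'Rana', 'Nan'],
    'Department': ['Deli', 'Dairy', 'Grocery', 'Grocery', 'Deli'],
    'Email':      ['abc', 'abc', 'abc', 'pqr', 'pqr'],
})


def add_match_columns(frame, cols):
    """Hypothetical helper: return a copy of frame with Matches and Columnmatch added."""
    values = frame[cols].to_numpy()
    # counts[i, j] = number of columns in which row i equals row j
    counts = (values[:, None, :] == values[None, :, :]).sum(axis=2)
    # A row must not be reported as its own best match.
    np.fill_diagonal(counts, -1)
    out = frame.copy()
    out['Matches'] = counts.max(axis=1)
    # argmax returns the first position on ties; +1 turns it into a 1-based row number.
    out['Columnmatch'] = counts.argmax(axis=1) + 1
    return out


result = add_match_columns(df, ['FirstName', 'LastName', 'Department', 'Email'])
print(result)
```

For row 3 the maximum of 2 is shared by rows 1 and 2, so this sketch reports row 1 where the table in the question shows row 2; both are valid answers for a tie.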

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tobias Molenaar