'How to iterate over columns, with similar names, and check whether they are equal?

This question is a follow-up to a similar one I posted a few days ago.

I have a data frame -- example shown below -- containing columns with similar names: 'Agreement Date', 'Agreement Date_x', etc. I am trying to figure out how to check whether my columns match up, and then create a new yes/no column if they do or do not match up.

Contract ID Agreement Date Agreement Date_x
0 05/04/1997 05/04/1997
1 03/02/1997 03/04/1997
2 05/07/2020 05/06/2021

Note: my actual dataset contains a dozen 'Agreement Date' columns

Many thanks :)



Solution 1:[1]

You could filter on the leading name and count if the value is unique:

df['match'] = df.filter(regex='^Agreement Date').nunique(axis=1).eq(1)

output:

   Contract ID Agreement Date Agreement Date_x  match
0            0     05/04/1997       05/04/1997   True
1            1     03/02/1997       03/04/1997  False
2            2     05/07/2020       05/06/2021  False

as yes/no:

import numpy as np
df['match'] = np.where(df.filter(regex='^Agreement Date').nunique(axis=1).eq(1), 'yes', 'no')

output:

   Contract ID Agreement Date Agreement Date_x match
0            0     05/04/1997       05/04/1997   yes
1            1     03/02/1997       03/04/1997    no
2            2     05/07/2020       05/06/2021    no

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway