'Check if one column value is in two other columns in pandas
I have participant IDs over 13 years (13 columns that each have a diff number of participants) and I want to see how many new participants there are in each year. I'm trying to do this by comparing the values in one column with the other columns in the same dataframe. I successfully iterated through every value in baseline to see if it exists in Year1 values. However, when I try to iterate through every value in Year2 to see if it exists in either baseline or Year1, I'm getting the following error.
My code:
df["Year1_new"] = df["Year1"].isin(df["baseline"])
Year1_new_inf = sum(df['Year1_new'] == 0) #counts new participants
df["Year2_new"] = df["Year2"].isin(df["baseline","Year1"])
Year2_new_inf = sum(df['Year2_new'] == 0)
The error:
KeyError: ('baseline', 'Year1')
Thanks!
edit: df.head().to_dict()
{'baseline': {0: 1001.1, 1: 1002.1, 2: 1003.1, 3: 1004.1, 4: 1005.1},
'Year1': {0: 1001.1, 1: 1002.1, 2: 1003.1, 3: 1004.1, 4: 1005.1},
'Year2': {0: 1001.1, 1: 1002.1, 2: 1004.1, 3: 1005.1, 4: 1007.1},
'Year3': {0: 1001.1, 1: 1005.1, 2: 1006.1, 3: 1007.1, 4: 1008.1},
'Year4': {0: 1001.1, 1: 1002.1, 2: 1004.2, 3: 1005.1, 4: 1006.1},
'Year5': {0: 1001.1, 1: 1002.1, 2: 1003.1, 3: 1004.1, 4: 1005.1},
'Year6': {0: 1001.1, 1: 1004.2, 2: 1005.1, 3: 1006.1, 4: 1008.1},
'Year7': {0: 1001.1, 1: 1004.1, 2: 1005.1, 3: 1006.1, 4: 1008.1},
'Year10': {0: 1001.1, 1: 1002.1, 2: 1004.2, 3: 1005.1, 4: 1007.1},
'Year11': {0: 1001.1, 1: 1002.1, 2: 1004.2, 3: 1005.1, 4: 1007.1},
'Year12': {0: 1001.1, 1: 1004.2, 2: 1007.1, 3: 1008.1, 4: 1009.1},
'Year13': {0: 1004.2, 1: 1007.1, 2: 1008.1, 3: 1009.1, 4: 1011.1},
'Unnamed: 12': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Unnamed: 13': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
'Year1_new': {0: True, 1: True, 2: True, 3: True, 4: True}}
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
