How to check if a value exists in another pandas column that contains several comma-separated values
I have a problem with my data. I want to check whether the value in column A appears in column B, which contains several values separated by commas. If the value exists, column C should be filled with True; otherwise it should be filled with False.
A sample table looks like this:
| Column_A | Column_B | Column_C |
|---|---|---|
| A | A,B,C,AA,BB,CC | True |
| B | A,AA,BB,CC | False |
| C | A,B,C | True |
I already tried something like .apply(lambda x: x.Column_A in x.Column_B, axis=1), but it marked the second row as True because it detected B inside BB. Basically, my script doesn't treat the comma as a separator between values.
Is there a solution for this?
Solution 1:[1]
df['Column_C'] = df.apply(lambda x: x.Column_A in x.Column_B.split(','), axis=1)
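To see why splitting helps, here is a minimal sketch that rebuilds the sample table from the question and compares the substring test with the split-based test. The variable name substring_match is only illustrative and does not come from the original posts.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Column_A": ["A", "B", "C"],
    "Column_B": ["A,B,C,AA,BB,CC", "A,AA,BB,CC", "A,B,C"],
})

# Substring test on the raw string: "B" is found inside "BB",
# so the second row is wrongly reported as True.
substring_match = df.apply(lambda x: x.Column_A in x.Column_B, axis=1)

# Membership test on the split list: only whole comma-separated values match.
df["Column_C"] = df.apply(lambda x: x.Column_A in x.Column_B.split(","), axis=1)

print(substring_match.tolist())  # [True, True, True]
print(df["Column_C"].tolist())   # [True, False, True]
```

The only difference is that `in` is applied to a list of items rather than to one long string, so it compares whole values instead of searching for a substring.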
Solution 2:[2]
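Use split. The separator below is ', ' (a comma followed by a space); if the values have no space after the comma, as in the sample table above, split on ',' instead: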
df['Column_C'] = df.apply(lambda x: x.Column_A in x.Column_B.split(', '), axis=1)
If performance is important, use a list comprehension:
df['Column_C'] = [a in b.split(', ') for a, b in zip(df.Column_A, df.Column_B)]
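If the spacing around the commas is not consistent, one possible variation is to split on the bare comma and strip whitespace from each item before comparing. The sketch below assumes that case: the helper name has_value and the unevenly spaced sample strings are made up for illustration and are not part of the original answers.

```python
import pandas as pd

def has_value(value, delimited):
    """Hypothetical helper: True if value is one of the comma-separated items."""
    # Split on the comma only, then drop surrounding whitespace from each item.
    items = [item.strip() for item in delimited.split(",")]
    return value in items

# Illustrative data with inconsistent spacing around the commas (an assumption).
df = pd.DataFrame({
    "Column_A": ["A", "B", "C"],
    "Column_B": ["A, B,C ,AA", "A,AA, BB,CC", "A,B , C"],
})

# Same list-comprehension pattern as above, using the whitespace-tolerant helper.
df["Column_C"] = [has_value(a, b) for a, b in zip(df.Column_A, df.Column_B)]

print(df["Column_C"].tolist())  # [True, False, True]
```

This keeps the list-comprehension approach while accepting both 'A,B' and 'A, B' style input.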
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | le_camerone |
| Solution 2 |
