Search for the column name in a DataFrame's pos_tag value and set the column to 1 if it matches, else 0 (Python)
Dear Python community,
I am writing a Python script to perform prediction using Naive Bayes, SVM, and Decision Tree supervised learning. I have already completed every step, from data preprocessing through to getting predictions from my data.
However, I now need to add a few new columns (name, value) to the DataFrame.
For each new column, I need to check whether its name (e.g. excellent, strip, male) appears in that row's pos_tag_noun value. If it does, the column should be set to 1; otherwise it should be set to 0.
I have been working on this for two days but still have not found a solution.
I would really appreciate any ideas or solutions.
Thanks & Regards
Solution 1:[1]
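This should work. The idea is to pull the word out of each (word, tag) tuple, explode the lists so that each word gets its own row, one-hot encode the words with get_dummies, collapse back to one row per original index with groupby(...).max(), and then concat the result onto your DataFrame.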
pd.concat([df, pd.get_dummies(df['pos_tag_noun'].apply(lambda x: [item[0] for item in x]).explode(), dtype=int).groupby(level=0).max()], axis=1)
                            pos_tag_noun  bath  excellent  hang  male  strip
0                      [(excellent, NN)]     0          1     0     0      0
1  [(strip, NN), (bath, NN), (hang, NN)]     1          0     1     0      1
2                           [(male, NN)]     0          0     0     1      0
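For anyone who wants to run this end to end, here is a minimal sketch of the same explode + get_dummies idea split into named steps. The example DataFrame is reconstructed from the output above, so its exact contents are an assumption. Passing dtype=int is also my addition, since recent pandas versions return True/False from get_dummies by default rather than 1/0.

```python
import pandas as pd

# Example data reconstructed from the output shown above (an assumption
# about what the asker's real frame looks like)
df = pd.DataFrame({
    "pos_tag_noun": [
        [("excellent", "NN")],
        [("strip", "NN"), ("bath", "NN"), ("hang", "NN")],
        [("male", "NN")],
    ]
})

# Keep only the word from each (word, tag) tuple
words = df["pos_tag_noun"].apply(lambda tags: [word for word, _ in tags])

# One row per word; the original row label is repeated in the index
exploded = words.explode()

# One indicator column per distinct word, then collapse back to a single
# row per original row label (max gives 1 if the word occurred at all)
dummies = pd.get_dummies(exploded, dtype=int).groupby(level=0).max()

result = pd.concat([df, dummies], axis=1)
print(result)
```

Note that this creates one column for every distinct word found in pos_tag_noun. If you only want a fixed set of columns, you could select just those from dummies before the concat.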
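If you don't want to use a lambda function + explode, you can replace it with something like pd.DataFrame(df['pos_tag_noun'].to_list()).stack(). Note that this stacks the whole (word, tag) tuples, so you still need to take the first element of each one, and because stack() produces a two-level index you should group on level=0 (the original row) when collapsing the dummies.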
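A rough sketch of that stack-based variant is below, using the same reconstructed example data (again an assumption). The dropna() call is a precaution I added: depending on the pandas version, stack() may or may not keep the padding values created for the shorter lists.

```python
import pandas as pd

# Same example data as in the output above (reconstructed, so an assumption)
df = pd.DataFrame({
    "pos_tag_noun": [
        [("excellent", "NN")],
        [("strip", "NN"), ("bath", "NN"), ("hang", "NN")],
        [("male", "NN")],
    ]
})

# Spread each list across columns (one tuple per cell), then stack into a
# Series indexed by (original row, position in list); shorter lists are
# padded with missing values, which dropna() removes if stack() kept them
stacked = pd.DataFrame(df["pos_tag_noun"].to_list(), index=df.index).stack().dropna()

# Each remaining cell is a (word, tag) tuple; keep only the word
words = stacked.map(lambda pair: pair[0])

# Group on level=0 (the original row), not on the position level
dummies = pd.get_dummies(words, dtype=int).groupby(level=0).max()

result = pd.concat([df, dummies], axis=1)
print(result)
```

This should give the same indicator columns as the explode version. Which one reads better is mostly a matter of taste.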
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |

