Extracting strings, counting them, and transposing them as columns in a dataframe

I have a pandas dataframe with 2 columns, tag and message:

         tag            |       message
["string1","string2"]   | some text
["string","string3"]    | another text
["string2"]             | another another text

I want to build a dataset for multi-label classification, so I need to extract all the distinct strings from tag because they are my labels.

What I need:

I need to turn the roughly 40 distinct strings found in tag into columns, and then fill each column with the count of that string in each row's tag list. The final dataframe should look like this:

         tag            |       message        | string | string1 | string2 | string3
["string1","string2"]   | some text            |   0    |    1    |    1    |    0
["string","string3"]    | another text         |   1    |    0    |    0    |    1
["string2"]             | another another text |   0    |    0    |    1    |    0

Note that the resulting dataframe (new_df) must keep the 2 original columns and add about 40 new ones, one for each distinct string in the tag column.

How can I do this in Julia?
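
For reference, here is a minimal sketch of what I think the operation looks like in Julia. It assumes the DataFrames.jl package is installed and that each tag cell is already a vector of strings. If the cells are stored as text (for example after reading a CSV), they would need to be parsed first. The sample data is the three rows from above, and the variable name labels is only illustrative.

```julia
using DataFrames

# Sample data mirroring the question; each tag cell is a Vector{String} (assumption).
df = DataFrame(
    tag = [["string1", "string2"], ["string", "string3"], ["string2"]],
    message = ["some text", "another text", "another another text"],
)

# Collect the distinct labels across all rows, sorted for a stable column order.
labels = sort(unique(reduce(vcat, df.tag)))

# Work on a copy so the original two-column dataframe is left untouched.
new_df = copy(df)

# Add one integer column per label, holding how often it occurs in each row's tag list.
for label in labels
    new_df[!, label] = [count(==(label), tags) for tags in new_df.tag]
end
```

With about 40 distinct strings the same loop would simply add about 40 columns. Using count rather than a membership check keeps the values correct even if a string appears more than once in a single tag list.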

julia ijulia-notebook

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
