Count symbols/punctuation in tweets

From a pandas DataFrame, I need to count the punctuation marks in each tweet, by sentiment. The data is:

Tweet                                                        Sentiment
Once upon a time, in the middle of nowhere, ... !                 0
What are you f*** do?                                             -1
It's a lovely day!! :)                                           1

My desired output would be:

Tweet                                                        Sentiment        Punctuation_count
Once upon a time, in the middle of nowhere, ... !                 0            6
What are you f*** do?                                             -1           4
It's a lovely day!! :)                                           1             5
  

If I wanted to remove punctuation, I would use:

df["Punctuation"] = df['Tweet'].str.replace(r'[^\w\s]', '', regex=True)

But what I would like to do is count the punctuation in each Tweet.



Solution 1:[1]

One option is to simply count how many characters in each string appear in string.punctuation (which covers ASCII punctuation only):

import string
df['Punctuation_count'] = df['Tweet'].apply(lambda x: sum(el in string.punctuation for el in x))

Output:

                                               Tweet  Sentiment  Punctuation_count
0  Once upon a time, in the middle of nowhere, ... !          0                  6
1                              What are you f*** do?         -1                  4
2                             It's a lovely day!! :)          1                  5
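
A vectorized alternative, sketched here under the assumption that "punctuation" means any character that is neither a word character nor whitespace, is to reuse the question's regex with `Series.str.count`. The column name `Punctuation_count_regex` and the per-sentiment total are illustrative additions, not part of the original answer.

```python
import string

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Tweet": [
        "Once upon a time, in the middle of nowhere, ... !",
        "What are you f*** do?",
        "It's a lovely day!! :)",
    ],
    "Sentiment": [0, -1, 1],
})

# Approach from the answer: test each character against string.punctuation.
df["Punctuation_count"] = df["Tweet"].apply(
    lambda x: sum(ch in string.punctuation for ch in x)
)

# Alternative: count every character that is neither a word character
# nor whitespace. Unlike string.punctuation, this does not count "_"
# and it does count non-ASCII symbols such as emoji.
df["Punctuation_count_regex"] = df["Tweet"].str.count(r"[^\w\s]")

# Assumption: "by sentiment" might also mean a total per sentiment class.
per_sentiment = df.groupby("Sentiment")["Punctuation_count"].sum()
```

On the three sample tweets both columns agree. They can differ on real tweets, since the two definitions of punctuation treat underscores and non-ASCII symbols differently, so it may be worth picking one deliberately.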

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1