How do I calculate percentages of counted non-numerical values in Pandas?
I have the columns date and intensity, which I grouped and counted this way:
intensity = dataframe_scraped.groupby(["date","intensity"]).count()['sentiment']
which yielded the following results:
date     intensity
2021-01  negative           33
         neutral            72
         positive           44
         strong_negative    24
         strong_positive    22
..
2022-05  positive           13
         strong_negative    20
         strong_positive    16
         weak_negative      12
         weak_positive      18
I want to convert these counts into percentages within each date so that I can bar-plot them later. Any ideas on how to achieve this?
I've tried something naïve along the lines of:
100 * dataframe_scraped.groupby(["date","intensity"]).count()['sentiment'] / dataframe_scraped.groupby(["date","intensity"]).count()['sentiment'].transform('sum')
Solution 1:[1]
I think this should work:
df.value_counts(subset=["date", "intensity"]) / df.value_counts(subset=["date"])
This divides the count of each (date, intensity) pair by the total count for that date (for example, the 33 negative rows divided by the total for 2021-01). The result is a proportion between 0 and 1, so multiply by 100 if you want percentages.
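As a rough sketch of the same per-date calculation written with groupby, here is a small self-contained example. The sample rows and the variable names (df, counts, percent, percent_alt) are made up for illustration and are not from the question; the column names date, intensity and sentiment follow the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "date": ["2021-01"] * 4 + ["2021-02"] * 2,
    "intensity": ["negative", "negative", "neutral", "positive",
                  "neutral", "positive"],
    "sentiment": [-0.5, -0.4, 0.0, 0.6, 0.1, 0.7],
})

# Count rows per (date, intensity) pair, as in the question.
counts = df.groupby(["date", "intensity"])["sentiment"].count()

# Divide each count by the total for its date; the transform keeps the
# original (date, intensity) index, so the division aligns row by row.
percent = 100 * counts / counts.groupby(level="date").transform("sum")

# Alternative: let value_counts normalize within each date group.
percent_alt = df.groupby("date")["intensity"].value_counts(normalize=True) * 100

print(percent)
```

Within each date the percentages add up to 100, which is a quick way to check the result before plotting.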
The other interpretation of your question is that you want each count as a proportion of all rows in the whole dataframe, in which case you could use this:
df.value_counts(subset=["date", "intensity"], normalize=True)
which returns each (date, intensity) group's count divided by the total number of rows, so the values sum to 1 across the whole dataframe.
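To make the difference from the per-date version concrete, here is a minimal sketch of the whole-dataframe proportion. The sample rows and the names df and overall are hypothetical; normalize=True is the documented parameter of DataFrame.value_counts.

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
df = pd.DataFrame({
    "date": ["2021-01"] * 4 + ["2021-02"] * 2,
    "intensity": ["negative", "negative", "neutral", "positive",
                  "neutral", "positive"],
})

# Each (date, intensity) count divided by the total number of rows (6 here).
overall = df.value_counts(subset=["date", "intensity"], normalize=True)

print(overall * 100)  # as percentages of the whole dataframe
```

Here the proportions sum to 1 over all dates together, not within each date, so this form only suits a plot where every bar is a share of the full dataset.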
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Rawson |
