How to have the y-axis divided by the total length of the data in a seaborn histplot?
I have a large dataset and I am trying to plot a histogram in which the y-axis shows, for each bin, the number of data points in that bin divided by the total length of the dataset. This is my current code:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.read_csv("df.csv")
f, ax = plt.subplots(figsize=(10, 5))
# Keep only the rows where columnA is 0
df = df[df['columnA'] == 0]
sns.histplot(df['columnB'], kde=False, label='label', color='b')
plt.legend(prop={'size': 12})
plt.title('Title')
plt.xlabel('xlabel')
plt.ylabel('ylabel')
plt.show()
How can I do that? Is there a parameter for seaborn histplot that helps with it?
Solution 1:[1]
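If you want to show the percentage instead of the actual count, you can pass the stat parameter: sns.histplot(df['columnB'], kde=False, label='label', color='b', stat='percent'). If you want a fraction between 0 and 1 rather than a percentage, use stat='probability' instead.
The seaborn documentation lists the accepted values of stat: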
count: show the number of observations in each bin
frequency: show the number of observations divided by the bin width
probability or proportion: normalize such that bar heights sum to 1
percent: normalize such that bar heights sum to 100
density: normalize such that the total area of the histogram equals 1
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Z Li |
