How to have the y-axis divided by the total length of the data in a seaborn histplot?

I have a large dataset and I am trying to plot a histogram in which the y-axis shows the number of data points with a certain value as a fraction of the total length of the dataset. This is my current code:

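import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.read_csv("df.csv")

f, ax = plt.subplots(figsize=(10, 5))
# Keep only the rows where columnA is 0
df = df[df['columnA'] == 0]
# Plot raw counts per bin (the default stat)
sns.histplot(df['columnB'], kde=False, label='label', color='b')
plt.legend(prop={'size': 12})
plt.title('Title')
plt.xlabel('xlabel')
plt.ylabel('ylabel')
plt.show()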

How can I do that? Is there a parameter for seaborn histplot that helps with it?



Solution 1:[1]

If you want to show the percentage instead of the raw count, pass stat='percent': sns.histplot(df['columnB'], kde=False, label='label', color='b', stat='percent'). For a fraction between 0 and 1 rather than a percentage, use stat='probability' instead.

The stat parameter of histplot accepts the following values:

count: show the number of observations in each bin

frequency: show the number of observations divided by the bin width

probability or proportion: normalize such that bar heights sum to 1

percent: normalize such that bar heights sum to 100

density: normalize such that the total area of the histogram equals 1
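To make the difference between these options concrete, here is a minimal sketch that reproduces each normalization by hand with NumPy. It follows the definitions listed above rather than seaborn's internal code, and the sample values, the bin layout, and the total_rows figure are made up for illustration.

```python
import numpy as np

# Hypothetical sample standing in for df['columnB'] after filtering.
values = np.array([1.0, 1.5, 2.0, 2.5, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0])

# Three equal-width bins over [0, 6]: edges are 0, 2, 4, 6.
counts, edges = np.histogram(values, bins=3, range=(0.0, 6.0))
widths = np.diff(edges)
n = counts.sum()

# Bar heights under each normalization, following the definitions above.
stats = {
    "count": counts.astype(float),         # observations per bin
    "frequency": counts / widths,          # count divided by bin width
    "probability": counts / n,             # heights sum to 1
    "percent": 100 * counts / n,           # heights sum to 100
    "density": counts / (n * widths),      # total area equals 1
}

for name, heights in stats.items():
    print(f"{name:>11}: {np.round(heights, 3)}")

# Assumption: if the denominator should be the length of the unfiltered
# dataset, weight every kept row by 1 / total_rows instead.
total_rows = 20  # hypothetical len(df) before filtering on columnA
fraction_of_total, _ = np.histogram(
    values, bins=3, range=(0.0, 6.0),
    weights=np.full(len(values), 1.0 / total_rows),
)
print("fraction of all rows:", np.round(fraction_of_total, 3))
```

With stat='probability' the denominator is the number of values actually plotted, that is, the rows left after filtering. If the fraction should instead be relative to the whole dataset, one option is to keep the default stat='count' and pass per-row weights of 1 / total_rows through the weights parameter of sns.histplot, which is what the last part of the sketch imitates.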

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution | Source
Solution 1 | Z Li