How to draw a consistent Probability Density Function (PDF) plot regardless of sample size in Python?

I have a question about drawing a Probability Density Function (PDF) plot in Python whose shape does not depend on the sample size.

This is my code.

# Library
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as stats

# Data frame
x = np.random.normal(45, 9, 1000)
source = {"Genotype": ["CV1"]*1000, "AGW": x}
df=pd.DataFrame(source)

# Calculating PDF
df_mean = np.mean(df["AGW"])
df_std = np.std(df["AGW"])
pdf = stats.norm.pdf(df["AGW"].sort_values(), df_mean, df_std)

# Graph
plt.plot(df["AGW"].sort_values(), pdf, color="black")
plt.xlim([0,90])
plt.xlabel("Grain weight (mg)", size=12)
plt.ylabel("Frequency", size=12)
plt.grid(True, alpha=0.3, linestyle="--")
plt.show()


This is the resulting graph. However, when I change the sample size from 1000 to 100, e.g. x = np.random.normal(45, 9, 100), the shape of the graph changes.


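This is because a small sample cannot represent the full normal distribution: the curve is only evaluated at the sampled values, so with few samples the tails are sparsely covered and the line stops at the smallest and largest observation. If we draw a normal distribution graph in Excel with a limited sample size, we run into the same problem.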
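To make the point above concrete, here is a minimal sketch that prints the x-range a line plotted at the sampled values would cover. The seed and the helper name sample_span are assumptions added for illustration; they are not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility


def sample_span(n, mean=45, sd=9):
    """Return (min, max) of n normal draws.

    A line plotted at the sampled x values can only extend over this range.
    """
    x = rng.normal(mean, sd, n)
    return x.min(), x.max()


for n in (100, 1000):
    lo, hi = sample_span(n)
    print(f"n={n}: curve is drawn only from {lo:.1f} to {hi:.1f}")
```

Smaller samples usually cover a narrower range and leave larger gaps between neighbouring points, which is why the plotted line looks different.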

However, in R, stat_function() always gives the same shape of normal distribution graph regardless of sample size.

When I run the R code below, I get the same shape for any sample size, because it assumes the full normal distribution for the given mean and standard deviation.

Could you let me know how I can get such a consistent normal distribution graph in Python, as in R? I'd like to obtain the same shape of curve regardless of sample size.

Many thanks, as always!

library(ggplot2)

AGW<-rnorm(100, mean=45, sd=9)
Genotype<-c(rep("CV1",100))

df<- data.frame (Genotype, AGW)

ggplot () +
  stat_function(data=df, aes(x=AGW), color="Black", size=1, fun = dnorm, 
                args = c(mean = mean(df$AGW), sd = sd(df$AGW))) + 
  scale_x_continuous(breaks = seq(0,90,10),limits = c(0,90)) + 
  scale_y_continuous(breaks = seq(0,0.05,0.01), limits = c(0,0.05)) +
  labs(x="Grain weight (mg)", y="Frequency") +
  theme_grey(base_size=15, base_family="serif")+
  theme(axis.line= element_line(size=0.5, colour="black")) +
  windows(width=6, height=5)
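A rough Python counterpart of the stat_function() call above is to evaluate the fitted normal PDF on a fixed, evenly spaced grid instead of at the sampled values. The following is only a sketch under assumptions: the helper name normal_curve and the choice of 500 grid points are made up for illustration, and ddof=1 is used so that the standard deviation matches R's sd().

```python
import numpy as np
import scipy.stats as stats


def normal_curve(values, x_min=0, x_max=90, n_points=500):
    """Evaluate the fitted normal PDF on a fixed grid.

    The grid depends only on x_min, x_max and n_points, so the curve
    always spans the same range however many samples are in `values`.
    """
    mean = np.mean(values)
    sd = np.std(values, ddof=1)  # ddof=1 matches R's sd(); np.std defaults to ddof=0
    grid = np.linspace(x_min, x_max, n_points)
    return grid, stats.norm.pdf(grid, mean, sd)


# Same grid for a small and a large sample; only mean and sd differ slightly.
grid_small, pdf_small = normal_curve(np.random.normal(45, 9, 100))
grid_large, pdf_large = normal_curve(np.random.normal(45, 9, 1000))
```

In the plotting code from the question, plt.plot(grid_small, pdf_small, color="black") would then replace the call that plots against df["AGW"].sort_values(). The sample only influences the estimated mean and standard deviation, not the set of x positions at which the curve is drawn.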


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
