Plotting KDEs in Python

Using SciPy's gaussian_kde() function, one can estimate the probability density p(x) for 1-D data.

For example, the marginal density of each column of a 2-D dataset can be estimated and plotted like this:

from scipy import stats
import numpy as np
import matplotlib.pyplot as plt

# 50 samples of a 2-D variable: column 0 is x, column 1 is y
data = np.random.rand(50,2)

kde_x = stats.gaussian_kde(data[:,0],bw_method = 'silverman')
x = np.linspace(data[:,0].min(), data[:,0].max(), 50)
fig, ax = plt.subplots(figsize=(8,6))
ax.hist(data[:,0], bins = 10, density = True,color='#0504aa', alpha=0.75, rwidth=0.90)
ax.plot(x, kde_x(x), color = 'black')

kde_y = stats.gaussian_kde(data[:,1],bw_method = 'silverman')
y = np.linspace(data[:,1].min(), data[:,1].max(), 50)
fig, ax = plt.subplots(figsize=(8,6))
ax.hist(data[:,1], bins = 10, density = True,color='#0504aa', alpha=0.75, rwidth=0.90)
ax.plot(y, kde_y(y), color = 'black')
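
As a sanity check on the 1-D estimates, the sketch below shows that a gaussian_kde object behaves like a density, not a point probability: its total mass over the real line is about 1, and probabilities come from integrating it over an interval. The fixed seed and the names `sample`, `total_mass` and `mass_below_half` are assumptions made for illustration; they are not part of the code above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed (assumption) so the run is repeatable
sample = rng.random(50)          # stand-in for one column such as data[:, 0]

kde = stats.gaussian_kde(sample, bw_method='silverman')

# Total mass of the estimated density over the whole real line
total_mass = kde.integrate_box_1d(-np.inf, np.inf)

# Estimated probability that X falls below 0.5
mass_below_half = kde.integrate_box_1d(-np.inf, 0.5)

print(total_mass, mass_below_half)
```

Values returned by `kde(x)` are density heights and may exceed 1 for tightly concentrated data; only the integrals are probabilities.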

My first question: if I pass a two-dimensional dataset to gaussian_kde(), is the result an estimate of p(x,y), i.e. the joint PDF?

# 50 x 50 evaluation grid spanning the range of the data
xx, yy = np.mgrid[data[:,0].min():data[:,0].max():50j, data[:,1].min():data[:,1].max():50j]
positions = np.vstack([xx.ravel(), yy.ravel()])
# gaussian_kde expects an array of shape (n_dimensions, n_points)
values = np.vstack([data[:,0], data[:,1]])
kernel = stats.gaussian_kde(values, bw_method = 'silverman')
f = np.reshape(kernel(positions), xx.shape)
fig = plt.figure(figsize=(10, 7))
ax = plt.axes(projection='3d')
surf = ax.plot_surface(xx, yy, f, rstride=1, cstride=1, cmap='Blues', edgecolor='none')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('PDF')
ax.set_title('Surface plot of Gaussian 2D KDE')
fig.colorbar(surf, shrink=0.5, aspect=5) # add color bar indicating the PDF
ax.view_init(50, 45)
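
One way to check whether the 2-D estimate behaves like a joint density is to integrate it numerically over a grid wide enough to cover its tails. The following is a minimal sketch under assumptions: the seed, the grid limits (-1.5 to 2.5) and the names `joint_kde`, `density` and `total_mass` are chosen for illustration, and the Riemann sum is only an approximation of the integral.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed (assumption) so the run is repeatable
data = rng.random((50, 2))       # same shape as the data in the question

# gaussian_kde expects shape (n_dimensions, n_points), hence the transpose
joint_kde = stats.gaussian_kde(data.T, bw_method='silverman')

# Grid that extends well beyond [0, 1] so the Gaussian tails are included
grid = np.linspace(-1.5, 2.5, 201)
step = grid[1] - grid[0]
xx, yy = np.meshgrid(grid, grid, indexing='ij')
positions = np.vstack([xx.ravel(), yy.ravel()])
density = joint_kde(positions).reshape(xx.shape)

# Riemann-sum approximation of the double integral of the estimate
total_mass = density.sum() * step * step
print(total_mass)
```

A total close to 1 is consistent with reading the 2-D gaussian_kde output as an estimate of the joint density p(x,y).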

My second question: is p(x)*p(y) as simple as multiplying the values estimated by the 'x' and 'y' KDEs? How would one show a 2-D surface plot or heat map of p(X=x)*p(Y=y)?
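
A minimal sketch of one possible approach: evaluate each 1-D KDE on its own axis and take the outer product, so that entry [i, j] holds p(x[i])*p(y[j]), then draw it as a heat map. The seed, the name `p_indep` and the choice of pcolormesh are assumptions for illustration. Note that this product equals the joint density only if X and Y are independent.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed (assumption) so the run is repeatable
data = rng.random((50, 2))

# Marginal 1-D KDEs, as in the question
kde_x = stats.gaussian_kde(data[:, 0], bw_method='silverman')
kde_y = stats.gaussian_kde(data[:, 1], bw_method='silverman')
x = np.linspace(data[:, 0].min(), data[:, 0].max(), 50)
y = np.linspace(data[:, 1].min(), data[:, 1].max(), 50)

# Outer product: p_indep[i, j] = p(x[i]) * p(y[j])
p_indep = np.outer(kde_x(x), kde_y(y))

# pcolormesh expects rows to run along y, hence the transpose
fig, ax = plt.subplots(figsize=(8, 6))
mesh = ax.pcolormesh(x, y, p_indep.T, shading='auto', cmap='Blues')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('Product of marginal KDEs')
fig.colorbar(mesh, ax=ax, label='p(x) * p(y)')
```

The same `p_indep` array could be passed to plot_surface together with a matching meshgrid to compare it against the 2-D KDE surface.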



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
