Convolution and Shift of approximated distribution (DensityEstimator)

Background

Imagine a signal that is processed multiple times, or a process with several steps in sequence. Every step manipulates its input and can be seen as a 'stochastic transfer function'.

Idea

Given a set of samples X, one can fit a kernel density estimator (kde), such as the one from sklearn: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KernelDensity.html#sklearn.neighbors.KernelDensity

from sklearn.neighbors import KernelDensity
import numpy as np

X = [[0], [4], [1], [4], [-1], [0], [0]]
# Defaults: Gaussian kernel, bandwidth=1.0
kde = KernelDensity().fit(X)

Now I want to perform some arithmetic operations on the distribution to obtain another distribution:

  • Shift the distribution by a scalar (+- float)
  • Multiply the distribution by a scalar (* float)
  • Convolve the distribution with another distribution

The results should approximate what you would get by performing the same operations on X and THEN fitting the kde.
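For the shift and the scalar multiplication, one possible shortcut avoids sampling entirely. This is a minimal sketch, assuming a symmetric kernel (the default Gaussian qualifies), a numeric bandwidth, and that the fitted points can be read back from `kde.tree_.data`. A kde is a sum of kernels centred on the training points, so shifting by c moves the centres by c and leaves the bandwidth unchanged, while multiplying by a scales the centres by a and the bandwidth by |a|. The helper names `shift_kde` and `scale_kde` are hypothetical, not part of sklearn.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def shift_kde(kde, c):
    # Hypothetical helper: same kernels, centres moved by c, bandwidth unchanged.
    centres = np.asarray(kde.tree_.data)
    return KernelDensity(kernel=kde.kernel, bandwidth=kde.bandwidth).fit(centres + c)

def scale_kde(kde, a):
    # Hypothetical helper: centres scale by a, bandwidth by |a| (requires a != 0).
    centres = np.asarray(kde.tree_.data)
    new_bandwidth = float(abs(a) * kde.bandwidth)
    return KernelDensity(kernel=kde.kernel, bandwidth=new_bandwidth).fit(a * centres)

X = np.array([[0], [4], [1], [4], [-1], [0], [0]], dtype=float)
kde = KernelDensity().fit(X)

shifted = shift_kde(kde, 10)
multiplied = scale_kde(kde, 2)

grid = np.linspace(-5, 9, 200)[:, np.newaxis]
orig_dens = np.exp(kde.score_samples(grid))
# The shifted density at x + 10 should equal the original density at x.
shift_err = np.max(np.abs(np.exp(shifted.score_samples(grid + 10)) - orig_dens))
# The scaled density at 2x should equal half the original density at x.
scale_err = np.max(np.abs(np.exp(multiplied.score_samples(2 * grid)) - orig_dens / 2))
print(shift_err, scale_err)
```

Both errors should be at the level of floating-point rounding, because no re-estimation from random samples takes place.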

Bad approach

Of course, one could draw samples, perform the operations on them, and then fit another kde. For a large number of samples (e.g. 10^5) the results are somewhat acceptable, but the computation time is not.

samples = kde.sample(10**5)

shifted = KernelDensity().fit(samples + 10)
multiplied = KernelDensity().fit(samples * 2)

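# The density of a sum of independent variables is the convolution of their
# densities, so add independent draws. (np.convolve on the raw samples would
# convolve the two sequences, not the two distributions.)
conv_samples = samples + kde.sample(10**5)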
convolved = KernelDensity().fit(conv_samples)

Also, re-fitting smooths the previous distribution a second time in this approach, which is not the desired outcome for a pure shift.
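The extra smoothing can be quantified under the assumption of Gaussian kernels (the sklearn default). Sampling from a kde with bandwidth h and re-fitting with the same h applies the kernel twice, so on average the re-fitted curve behaves like a kde on the original points with bandwidth sqrt(h^2 + h^2) = h * sqrt(2). The sketch below compares the density at the most frequent data value; the sample size and seed are arbitrary choices.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

X = np.array([[0], [4], [1], [4], [-1], [0], [0]], dtype=float)
h = 1.0  # sklearn's default bandwidth
kde = KernelDensity(bandwidth=h).fit(X)

# Sample-and-refit shift (seed and sample size are arbitrary).
samples = kde.sample(20000, random_state=0)
refit_shifted = KernelDensity(bandwidth=h).fit(samples + 10)

# Reference: a kde on the shifted original points with the widened bandwidth.
widened = KernelDensity(bandwidth=h * np.sqrt(2)).fit(X + 10)

peak_original = np.exp(kde.score_samples([[0.0]]))[0]
peak_refit = np.exp(refit_shifted.score_samples([[10.0]]))[0]
peak_widened = np.exp(widened.score_samples([[10.0]]))[0]

print(f"original density at 0:          {peak_original:.3f}")
print(f"re-fitted density at 10:        {peak_refit:.3f}")
print(f"widened-bandwidth kde at 10:    {peak_widened:.3f}")
```

The re-fitted value should fall clearly below the original one and land close to the widened-bandwidth reference, up to sampling noise.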

Blue is the original, red the shifted distribution:

import matplotlib.pyplot as plt

def plot_kde(kde, color):
    # Evaluate the density on a grid spanning the fitted data plus a margin
    data = kde.tree_.data.base
    X_plot = np.linspace(data.min() - 5, data.max() + 5, 1000)[:, np.newaxis]
    log_dens = kde.score_samples(X_plot)  # log-density at each grid point
    plt.plot(X_plot, np.exp(log_dens), color=color)

plot_kde(kde, 'navy')
plot_kde(shifted, 'red')
plt.show()

How can we improve these operations?
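One direction that avoids sampling for the convolution, offered as a sketch under stated assumptions rather than a definitive answer: with Gaussian kernels, the convolution of two kdes is again a Gaussian kde whose centres are all pairwise sums of the two sets of training points and whose bandwidth is sqrt(h1^2 + h2^2). The helper name `convolve_kdes` is hypothetical, and it assumes numeric bandwidths and one-dimensional data. The number of centres grows as n * m, so this only suits moderately sized training sets.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def convolve_kdes(kde_a, kde_b):
    # Hypothetical helper; valid only for Gaussian kernels and 1-D data.
    a = np.asarray(kde_a.tree_.data).ravel()
    b = np.asarray(kde_b.tree_.data).ravel()
    centres = (a[:, None] + b[None, :]).reshape(-1, 1)  # all pairwise sums
    bandwidth = float(np.hypot(kde_a.bandwidth, kde_b.bandwidth))
    return KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(centres)

X = np.array([[0], [4], [1], [4], [-1], [0], [0]], dtype=float)
kde = KernelDensity().fit(X)
convolved = convolve_kdes(kde, kde)

# Cross-check against a numerical convolution of the density on a grid.
dx = 0.01
grid = np.arange(-10, 14 + dx, dx)
dens = np.exp(kde.score_samples(grid[:, None]))
numeric = np.convolve(dens, dens) * dx  # Riemann-sum approximation
numeric_grid = 2 * grid[0] + dx * np.arange(numeric.size)
exact = np.exp(convolved.score_samples(numeric_grid[:, None]))
max_err = np.max(np.abs(numeric - exact))
print(max_err)
```

Here `np.convolve` is applied to density values on a regular grid, where it does approximate the convolution integral, in contrast to applying it to raw samples.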



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
