'Inverse cumulative distribution function in MATLAB given an empirical PDF

Is it possible to generate random numbers with a distribution that depends on empirical probability data? I imagine this is possible by taking the inverse cumulative distribution function. I have seen some examples where this is done in MATLAB (the software that I'm using) but all of those examples have an underlying analytic form for the probability. Here I have only the PDF. For instance, I have data of probabilities for a particular event. Most of the probabilities are zero and hence not unique, but not all.

My goal is to generate the random numbers and then figure out what the distribution is. I'd really appreciate if people can help clear up my thinking here.

EDIT:

I think I want something like:

cdf=cumsum(pdf); % calculate pdf from empirical pdf
M=length(cdf);
xq=linspace(0,1,M);
invcdf=interp1(cdf,xq,xq); % calculate inverse cdf, i.e., x

but how do I take into account that a lot of the values of the pdf are zero and not unique? Is this even the right approach?



Solution 1:[1]

I am basing my answer on Inverse empirical cumulative distribution function from the MathWorks File Exchange. See that link for other suggestions to solving your problem.

% y  - input: data set
% q  - input: desired quantile (can be a scalar or a vector)
% xq - output: ICDF at specified quantile

[f, x] = ecdf(y);

xq = zeros(size(q));

for ii = 1:length(q)
    xq(ii) = min(x(q(ii) <= f));
end

I'd eliminate the for loop if you're only using scalars. Also, there may be a more efficient way to vectorize the for loop, but this should at least get you started.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 GrapefruitIsAwesome