Can I vectorize scipy.interpolate.interp1d?

interp1d works well for each of the individual datasets that I have; however, I have in excess of 5 million datasets that need to be interpolated.

I need the interpolation to be cubic, and there should be one interpolant per dataset.

Right now I am able to do this with a for loop; however, for 5 million datasets this takes quite some time (about 15 minutes):

from scipy.interpolate import interp1d

# Build one cubic interpolant per dataset, one loop iteration at a time.
interpolants = []
for i in range(5000000):
    interpolants.append(interp1d(xArray[i], interpData[i], kind='cubic'))

What I'd like to do would maybe look something like this:

interpolants = interp1d(xArray, interpData, kind='cubic')

This, however, fails with the error:

ValueError: x and y arrays must be equal in length along interpolation axis.

Both my x array (xArray) and my y array (interpData) have identical dimensions...

I could parallelize the for loop, but that would only give me a small increase in speed; I'd greatly prefer to vectorize the operation.
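
For context, interp1d does handle many datasets in a single call, but only when they all share one 1-D x grid: y may then be 2-D, with one dataset per row and the axis argument selecting the interpolation axis. It is the per-row x arrays that trip the length check above. A minimal sketch of that natively supported case, using illustrative toy data:

import numpy as np
from scipy.interpolate import interp1d

x = np.linspace(0.0, 1.0, 10)              # one shared 1-D grid
y = np.random.rand(5, 10)                  # five datasets, one per row
f = interp1d(x, y, kind='cubic', axis=-1)  # a single call covers all rows
print(f(0.5))                              # shape (5,): one value per dataset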



Solution 1:[1]

I have also been trying to do something similar over the past few days. I finally managed to do it with np.vectorize, using function signatures. Try the code snippet below:

import numpy as np
from scipy import interpolate

# Each call consumes one length-n row from x and from y and returns one object.
fn_vectorized = np.vectorize(interpolate.interp1d,
                             signature='(n),(n)->()')
interp_fn_array = fn_vectorized(x[np.newaxis, :, :], y)

x and y are arrays of shape (m, n). The objective was to generate one interpolation function for row i of x and row i of y. The array interp_fn_array contains the interpolation functions (its shape is (1, m)).
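
Putting it together, below is a small end-to-end sketch of this approach under some assumptions: the toy shapes m and n, the random sample data, and the functools.partial wrapper (added so that kind='cubic' from the question is honored, since the snippet above builds linear interpolants by default) are illustrative and not part of the original answer:

from functools import partial
import numpy as np
from scipy import interpolate

m, n = 1000, 10                            # toy stand-ins for 5 million datasets
x = np.sort(np.random.rand(m, n), axis=1)  # one ascending x grid per row
y = np.random.rand(m, n)                   # one y dataset per row

# partial() pins kind='cubic' so np.vectorize maps only over x and y.
fn_vectorized = np.vectorize(partial(interpolate.interp1d, kind='cubic'),
                             signature='(n),(n)->()')
interpolants = fn_vectorized(x, y)         # object array of shape (m,)

# Each element is an ordinary interp1d instance and can be called as usual.
print(interpolants[0]((x[0, 0] + x[0, -1]) / 2.0))

Note that np.vectorize is a convenience wrapper, not a performance tool: per the NumPy documentation it is implemented essentially as a for loop, so the speedup over the question's explicit loop is likely to be modest.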

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
[1] Solution 1: Engineero