Having issues doing a spline interpolation between columns using Pandas

I have the following dataframe

         index  Flux_min  New_Flux  Flux_max
0            0  0.550613       NaN  0.537315
1            1  0.656621       NaN  0.620647
2            2  0.756486       NaN  0.700822
3            3  0.846038       NaN  0.775749
4            4  0.920871       NaN  0.843257
...        ...       ...       ...       ...
99997    99997  0.874460       NaN  0.805594
99998    99998  0.801958       NaN  0.743039
99999    99999  0.721355       NaN  0.676436
100000  100000  0.635054       NaN  0.606967
100001  100001  0.552247       NaN  0.535789

What I want to do is to perform a spline interpolation (or any other kind of interpolation that is not linear) between the Flux_min and Flux_max columns in order to figure out the values for the New_Flux column.

This is my code to do that

synth_spectra_df = synth_spectra_df.interpolate(method='spline', order=2, axis=1)

where synth_spectra_df is the dataframe shown above. If I run this code, I get the following error:

Traceback (most recent call last):
  File "spectral_interpolation.py", line 107, in <module>
    synth_spectra_df = synth_spectra_df.interpolate(method='spline', order=2, axis=1)
  File "/home/fmendez/anaconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 7209, in interpolate
    raise ValueError(
ValueError: Index column must be numeric or datetime type when using spline method other than linear. Try setting a numeric or datetime index column before interpolating.

I already made sure that all the columns have a numeric data type, and I also tried dropping the index column, but I still get the same error. A linear interpolation runs without complaint, but I'm not interested in a linear interpolation.

Any help would be greatly appreciated.



Solution 1:[1]

In order to interpolate, you need existing values to interpolate from. In particular, the documentation of that method includes an example where an element can't be filled in because it has no preceding entries. Quoting the documentation:

"Note how the first entry in column ‘b’ remains NaN, because there is no entry before it to use for interpolation."

This is the example the note refers to:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame([(0.0, np.nan, -1.0, 1.0),
...                    (np.nan, 2.0, np.nan, np.nan),
...                    (2.0, 3.0, np.nan, 9.0),
...                    (np.nan, 4.0, -4.0, 16.0)],
...                   columns=list('abcd'))
>>> df
     a    b    c     d
0  0.0  NaN -1.0   1.0
1  NaN  2.0  NaN   NaN
2  2.0  3.0  NaN   9.0
3  NaN  4.0 -4.0  16.0
>>> df.interpolate(method='linear', limit_direction='forward', axis=0)
     a    b    c     d
0  0.0  NaN -1.0   1.0
1  1.0  2.0 -2.0   5.0
2  2.0  3.0 -3.0   9.0
3  2.0  4.0 -4.0  16.0
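
To connect this back to the traceback in the question, here is a minimal sketch that reproduces the error on a small frame. The three-row frame is a hypothetical stand-in built from the first rows shown in the question, and the names labels_are_numeric and error_message are illustrative only. It rests on the assumption that, with axis=1, the interpolation runs along the column labels, so the "index" the error message refers to would be those labels rather than the row index.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the frame in the question
# (values copied from its first three rows).
df = pd.DataFrame({
    "Flux_min": [0.550613, 0.656621, 0.756486],
    "New_Flux": [np.nan, np.nan, np.nan],
    "Flux_max": [0.537315, 0.620647, 0.700822],
})

# With axis=1 the values are interpolated along the column labels,
# and these labels are strings, not numbers.
labels_are_numeric = pd.api.types.is_numeric_dtype(df.columns)

# Same call as in the question; capture the message instead of crashing.
try:
    df.interpolate(method="spline", order=2, axis=1)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print("column labels numeric:", labels_are_numeric)
print("error:", error_message)
```

If that assumption holds, making every column numeric does not change the outcome, because the check is on the labels and not on the values.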

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  Jacob Sushenok