How to replace NaNs with the average of the preceding and succeeding values in a pandas DataFrame?

If I have some missing values and would like to replace every NaN with the average of the preceding and succeeding values, how can I do that?

I know I can use pandas.DataFrame.fillna with method='ffill' or method='bfill' to replace NaN values with the preceding or succeeding value. However, I would like to fill with the average of those two values across the whole DataFrame, without iterating over rows and columns.



Solution 1:[1]

This may be late, but I just had the same question and the only other answer on this page did not do what I expected, so I am answering now. Your post asks to replace the NaNs with averages, but interpolation is not the right answer for me because it fills the empty cells along a straight line between the neighboring values. If you want to fill them with the average of the preceding and succeeding valid values instead, this code helped me:

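# df.bfill() / df.ffill() do the same as fillna(method=...), which is deprecated in recent pandas
dfb = df.bfill()           # each NaN takes the next valid value
dff = df.ffill()           # each NaN takes the previous valid value
dfmeans = (dfb + dff) / 2  # average of the two fills; non-NaN cells are unchanged
dfmeans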

For the DataFrame from the example above, the result looks like this:

     A       B
0  1.0   0.250
1  2.1   2.125
2  3.4   2.125
3  4.7   4.000
4  5.6  12.200
5  6.8  14.400

As you can see, at index 2 of column A both methods produce 3.4, because there the linear interpolation is also (2.1 + 4.7)/2. In column B, where rows 1 and 2 are both filled with 2.125, the result differs from what interpolation gives.
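To make the comparison concrete, here is a minimal sketch that runs both approaches side by side. The input frame is an assumption: it is reconstructed from the result table above by putting NaN at index 2 of column A and at indices 1 and 2 of column B. The name dflinear is only illustrative.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the result table above.
df = pd.DataFrame({
    "A": [1.0, 2.1, np.nan, 4.7, 5.6, 6.8],
    "B": [0.25, np.nan, np.nan, 4.0, 12.2, 14.4],
})

# Average of the nearest valid value before and after each NaN.
dfmeans = (df.ffill() + df.bfill()) / 2

# Linear interpolation along the rows, for comparison.
dflinear = df.interpolate()

print(dfmeans)
print(dflinear)
```

With this input, the two consecutive gaps in column B both get 2.125 from the averaging approach, while linear interpolation spaces them evenly between 0.25 and 4.0 (1.5 and 2.75).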

For a one-line version and its application to time series, see this post: Average between values with unevenly distributed time in Pandas DataFrame
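The linked post is not reproduced here. As a rough sketch of what a one-line form could look like, the two fills can be combined in a single expression. The helper name fill_with_neighbor_mean is hypothetical, and this version does not weight the neighbors by their distance in time.

```python
import numpy as np
import pandas as pd

def fill_with_neighbor_mean(obj):
    """Replace each NaN with the mean of the nearest valid values before and after it."""
    # Works for a Series or a DataFrame; existing values are left unchanged.
    return (obj.ffill() + obj.bfill()) / 2

s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])
filled = fill_with_neighbor_mean(s)
print(filled)
```

Note that a NaN at the very start or end stays NaN, because one of the two fills has no neighbor to copy from there. If those edges matter, they need a separate rule (for example a plain forward or backward fill).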

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 eliasmaxil