Handle NaN values in rolling operations

I'm testing rolling operations and ran into the following problem:

import polars as pl
import numpy as np

df = pl.DataFrame(
    {
        "values": [np.nan, 1, 1, 2, 4, 5, 3]
    }
)

df = df.select(
    [
        pl.all(),
        pl.col("values").rolling_apply(lambda s: s.min(), window_size=2).alias("rolling rank"),
        pl.col("values").rolling_min(window_size=2).alias("rolling min native")
    ]
)

print(df)

This code fails with the panic shown below, but if I remove np.nan it works perfectly. Is there a bug in the rolling operations?

File "/usr/local/anaconda3/envs/learn_polars/lib/python3.10/site-packages/polars/internals/lazy_frame.py", line 476, in collect
    return self._dataframe_class._from_pydf(ldf.collect())
pyo3_runtime.PanicException: called `Option::unwrap()` on a `None` value

In addition, this works:

df = df.select(
    [
        pl.all(),
        pl.col("values").rolling_apply(lambda s: s.min(), window_size=2).alias("rolling rank"),
    ]
)

I also tried min_periods, but nothing changed.

Polars version, provided by pip: 0.13.16

Thanks



Solution 1:[1]

This issue is fixed as of Polars 0.13.21. If you update, you should be able to take advantage of the better performance of the function-specific rolling_ methods (e.g., rolling_min, rolling_quantile, etc.).

df.with_columns(
    [
        pl.col("values").rolling_apply(lambda s: s.min(), window_size=2).alias("rolling min"),
        pl.col("values").rolling_min(window_size=2).alias("rolling min native")
    ]
)
shape: (7, 3)
?????????????????????????????????????????????
? values ? rolling min ? rolling min native ?
? ---    ? ---         ? ---                ?
? f64    ? f64         ? f64                ?
?????????????????????????????????????????????
? NaN    ? null        ? null               ?
?????????????????????????????????????????????
? 1.0    ? 1.0         ? NaN                ?
?????????????????????????????????????????????
? 1.0    ? 1.0         ? 1.0                ?
?????????????????????????????????????????????
? 2.0    ? 1.0         ? 1.0                ?
?????????????????????????????????????????????
? 4.0    ? 2.0         ? 2.0                ?
?????????????????????????????????????????????
? 5.0    ? 4.0         ? 4.0                ?
?????????????????????????????????????????????
? 3.0    ? 3.0         ? 3.0                ?
?????????????????????????????????????????????

Note that the optimized, function-specific rolling_ methods may treat NaN values differently than a generic rolling_apply.
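
To make that difference concrete, here is a minimal pure-Python sketch of two possible NaN conventions for a rolling minimum. It is not how Polars implements either method. The helper names rolling_min_propagate and rolling_min_skip are hypothetical, and None stands in for a Polars null. With window_size=2, the two conventions happen to reproduce the two columns in the output above.

```python
import math


def rolling_min_propagate(values, window_size):
    """Rolling minimum where any NaN inside the window makes the result NaN."""
    out = []
    for i in range(len(values)):
        if i + 1 < window_size:
            out.append(None)  # window not full yet: no result
            continue
        window = values[i + 1 - window_size : i + 1]
        if any(math.isnan(v) for v in window):
            out.append(math.nan)
        else:
            out.append(min(window))
    return out


def rolling_min_skip(values, window_size):
    """Rolling minimum that ignores NaN values inside the window."""
    out = []
    for i in range(len(values)):
        if i + 1 < window_size:
            out.append(None)  # window not full yet: no result
            continue
        window = values[i + 1 - window_size : i + 1]
        valid = [v for v in window if not math.isnan(v)]
        # If every value in the window is NaN, there is nothing to report.
        out.append(min(valid) if valid else None)
    return out


values = [math.nan, 1.0, 1.0, 2.0, 4.0, 5.0, 3.0]
propagated = rolling_min_propagate(values, window_size=2)
skipped = rolling_min_skip(values, window_size=2)
print(propagated)
print(skipped)
```

The only row where the two conventions disagree is the second one, whose window contains both NaN and 1.0. If your data has NaN values, it is worth checking which behavior you get before swapping rolling_apply for a function-specific method.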

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 cbilot