Filling `null` values of a column with another column

I want to fill the null values of a column with the content of another column of the same row in a lazy data frame in Polars.

Is this possible with reasonable performance?



Solution 1:[1]

There's a function for this: fill_null.

Let's say we have this data:

import polars as pl

df = pl.DataFrame({'a': [1, None, 3, 4],
                   'b': [10, 20, 30, 40]
                   }).lazy()
print(df.collect())
shape: (4, 2)
┌──────┬─────┐
│ a    ┆ b   │
│ ---  ┆ --- │
│ i64  ┆ i64 │
╞══════╪═════╡
│ 1    ┆ 10  │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ null ┆ 20  │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ 3    ┆ 30  │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ 4    ┆ 40  │
└──────┴─────┘

We can fill the null values in column a with values in column b:

# with_columns replaces the older with_column, which newer Polars releases no longer provide
df.with_columns(pl.col('a').fill_null(pl.col('b'))).collect()
shape: (4, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 10  │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ 20  ┆ 20  │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ 3   ┆ 30  │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ 4   ┆ 40  │
└─────┴─────┘

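This should perform well: fill_null with an expression is evaluated column-wise inside the Polars engine, with no Python-level loop over rows, and it works in lazy mode.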

Solution 2:[2]

I just found a possible solution:

df.with_columns(
    # Take b where a is null, otherwise keep a
    pl.when(pl.col("a").is_null())
    .then(pl.col("b"))
    .otherwise(pl.col("a")).alias("a")
)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 cbilot
Solution 2 zareami10