Filling `null` values of a column with another column
I want to fill the null values of a column with the content of another column of the same row in a lazy data frame in Polars.
Is this possible with reasonable performance?
Solution 1:[1]
There's a function for this: `fill_null`. Besides a literal value, it accepts an expression, such as another column.
Let's say we have this data:
import polars as pl
df = pl.DataFrame({'a': [1, None, 3, 4],
                   'b': [10, 20, 30, 40]
                   }).lazy()
print(df.collect())
shape: (4, 2)
┌──────┬─────┐
│ a    ┆ b   │
│ ---  ┆ --- │
│ i64  ┆ i64 │
╞══════╪═════╡
│ 1    ┆ 10  │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ null ┆ 20  │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ 3    ┆ 30  │
├╌╌╌╌╌╌┼╌╌╌╌╌┤
│ 4    ┆ 40  │
└──────┴─────┘
We can fill the null values in column a with values in column b:
df.with_columns(pl.col('a').fill_null(pl.col('b'))).collect()
shape: (4, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 10  │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ 20  ┆ 20  │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ 3   ┆ 30  │
├╌╌╌╌╌┼╌╌╌╌╌┤
│ 4   ┆ 40  │
└─────┴─────┘
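Performance should be quite good: the fill is expressed as a Polars expression, so it runs inside the lazy query in Polars' native engine rather than in a Python-level loop over rows.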
Solution 2:[2]
I just found a possible solution:
df.with_columns(
    pl.when(pl.col("a").is_null())
    .then(pl.col("b"))
    .otherwise(pl.col("a"))
    .alias("a")
)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | cbilot |
| Solution 2 | zareami10 |
