'How to get number of null for each row?
How can I get count null in the row? This is simple sample. In my actual df, there are hundreds columns. For example:
val df = Seq(("Alice", "20", "Female"), ("Bob", "30", "Male"), ("Ben", null, "Male"), ("Jessica", null, null)).toDF("name","age","sex")
I want to get df like this. How can I count how many null for each row in df?
val df2 = Seq(("Alice", "20", "Female", "0"), ("Bob", "30", "Male", "0"), ("Ben", null, "Male", "1"), ("Jessica", null, null, "2")).toDF("name","age","sex","CountOfNull")
Solution 1:[1]
There are many ways to do it. Here is one:
val df = Seq(("Alice", "20", "Female"),
("Bob", "30", "Male"), ("Ben", null, "Male"),
("Jessica", null, null)).toDF("name","age","sex")
scala> df.show()
+-------+----+------+
| name| age| sex|
+-------+----+------+
| Alice| 20|Female|
| Bob| 30| Male|
| Ben|null| Male|
|Jessica|null| null|
+-------+----+------+
df.registerTempTable("test")
scala> sql("SELECT name, age, sex, nvl2(age, 0, 1) + nvl2(sex, 0, 1) as CountOfNull FROM test").show()
+-------+----+------+-----------+
| name| age| sex|CountOfNull|
+-------+----+------+-----------+
| Alice| 20|Female| 0|
| Bob| 30| Male| 0|
| Ben|null| Male| 1|
|Jessica|null| null| 2|
+-------+----+------+-----------+
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | y?s??la |


