Programmatic way to loop over columns and replace null values with zeros in BigQuery?
I am trying to prepare a large data table in BigQuery for a regression that involves lots of "dummy" (aka categorical) variables.
One of the final steps in this process requires me to replace every null value in the table with zero.
Is there a clean, programmatic way to do this in BigQuery? For example, in the table below, I'd ideally like to loop over all the "country_*" fields and replace nulls with zero without hard-coding each column name. I have an inkling that this may be a job for dynamic SQL, but I get pretty lost swimming in the documentation. Any help would be greatly appreciated!
TLDR: This is an example of the data structure I'm facing.
| country | country_1 | country_2 | country_3 | other covariates |
|---|---|---|---|---|
| 1 | 1 | - | - | |
| 2 | - | 1 | - | |
| 3 | - | - | 1 | |
This is what I'd like to have:
| country | country_1 | country_2 | country_3 | other covariates |
|---|---|---|---|---|
| 1 | 1 | 0 | 0 | |
| 2 | 0 | 1 | 0 | |
| 3 | 0 | 0 | 1 | |
Simpleton method:

    select country,
      ifnull(country_1, 0) as country_1,
      ...
    FROM TABLE

Solution 1:[1]
Try the query below:

    -- JS helpers that return a JSON object's keys and values as parallel arrays
    create temp function extract_keys(input string) returns array<string>
    language js as "return Object.keys(JSON.parse(input));";

    create temp function extract_values(input string) returns array<string>
    language js as "return Object.values(JSON.parse(input));";

    select * except(json)
    from (
      select json, col, val
      from your_table t,
      -- serialize each row to JSON and replace every ':null' with ':0'
      unnest([struct(replace(to_json_string(t), ':null', ':0') as json)]),
      -- pair each key with its value by array offset
      unnest(extract_keys(json)) col with offset
      join unnest(extract_values(json)) val with offset
      using(offset)
    )
    -- turn the (col, val) pairs back into one column per listed name
    pivot (any_value(val) for col in ('country', 'country_1', 'country_2', 'country_3'))

If applied to the sample data in your question, the output has the nulls in the country_* columns replaced by zeros, as in the desired table above.
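
Since the question mentions dynamic SQL, here is a minimal sketch of that route as a possible alternative, not part of the answer above. It builds the select list from `INFORMATION_SCHEMA.COLUMNS` and runs it with `EXECUTE IMMEDIATE`. The names `your_dataset` and `your_table` are placeholders, and the `country_` prefix filter is an assumption about how the dummy columns are named; it also assumes those columns are numeric.

```sql
-- Sketch only: your_dataset and your_table are placeholder names.
DECLARE select_list STRING;

-- Build "IFNULL(`col`, 0) AS `col`" for every country_* column and
-- pass all other columns through unchanged, in table order.
SET select_list = (
  SELECT STRING_AGG(
           IF(STARTS_WITH(column_name, 'country_'),
              FORMAT('IFNULL(`%s`, 0) AS `%s`', column_name, column_name),
              FORMAT('`%s`', column_name)),
           ', ' ORDER BY ordinal_position)
  FROM your_dataset.INFORMATION_SCHEMA.COLUMNS
  WHERE table_name = 'your_table'
);

-- Run the generated query.
EXECUTE IMMEDIATE FORMAT(
  'SELECT %s FROM your_dataset.your_table', select_list);
```

One difference worth noting: this version keeps each column's original type, whereas the JSON-based query returns the pivoted values as strings, because the helper functions are declared to return `array<string>`.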
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |

