'Expanding a Struct of Struct to columns in bigquery

I am working with a BQ table that has a format of a STRUCT of STRUCTs.

It looks as follows: listoffields

I would like to have a table which looks like follows:

property_hs_email_last_click_date_value currentlyinworkflow_value hs_first_engagement_object_id_value hs_first_engagement_object_id_value__st
5/5/2022 23:00:00 Y 1 'Hey'

The challenge is that there are 500 fields and I would like to make this efficient instead of writing out every single line as follows:

SELECT property_hs_email_last_click_date as property_hs_email_last_click_date_value, 
    properties.currentlyinworkflow.value as currentlyinworkflow_value, 
    properties.hs_first_engagement_object_id.value as properties.hs_first_engagement_object_id_value,
    properties.hs_first_engagement_object_id.value__st as hs_first_engagement_object_id_value__st 

Any suggestions on how to make this more efficient?

Edit: Here's a query that creates a table such as this:

create or replace table `project.database.TestTable` ( 
   property_hs_email_last_click_date STRUCT < value string >,
   properties struct < currentlyinworkflow struct < value string > , 
                      hs_first_engagement_object_id struct  < value numeric , value__st string >, 
                      first_conversion_event_name struct < value string > 
                     >  

); 

insert into `project.database.TestTable`
values (struct('12/2/2022 23:00:02'), struct(struct('Yes'), struct(1, 'Thursday'), struct('Festival')) );

insert into `project.database.TestTable`
values (struct('14/2/2021 12:00:02'), struct(struct('No'), struct(5, 'Friday'), struct('Phone')) )


Solution 1:[1]

Below is quite generic script that extracts all leaves in JSON and then presents them as columns

create temp function  extract_keys(input string) returns array<string> language js as """
  return Object.keys(JSON.parse(input));
  """;
create temp function  extract_values(input string) returns array<string> language js as """
  return Object.values(JSON.parse(input));
  """;
create temp function extract_all_leaves(input string) returns string language js as '''
  function flattenObj(obj, parent = '', res = {}){
    for(let key in obj){
        let propName = parent ? parent + '.' + key : key;
        if(typeof obj[key] == 'object'){
            flattenObj(obj[key], propName, res);
        } else {
            res[propName] = obj[key];
        }
    }
    return JSON.stringify(res);
  }
  return flattenObj(JSON.parse(input));
  ''';
create temp table temp_table as (
  select offset, key, value, format('%t', t) row_id
  from your_table t, 
  unnest([struct(to_json_string(t) as json)]),
  unnest([struct(extract_all_leaves(json) as leaves)]),
  unnest(extract_keys(leaves)) key with offset
  join unnest(extract_values(leaves)) value with offset
  using(offset) 
);  

execute immediate (select '''
  select * except(row_id) from (select * except(offset) from temp_table)
  pivot (any_value(value) for replace(key, '.', '__') in (''' || keys_list || '''
  ))'''
from (select string_agg('"' || replace(key, '.', '__') || '"', ',' order by offset) keys_list from (
  select key, min(offset) as offset from temp_table group by key
))
); 

     

if applied to sample data in your question

create temp table your_table as (
  select struct('12/2/2022 23:00:02' as value) as property_hs_email_last_click_date , 
  struct(
    struct('Yes' as value) as currentlyinworkflow , 
    struct(1 as value, 'Thursday' as value__st) as hs_first_engagement_object_id , 
    struct('Festival' as value) as first_conversion_event_name 
  ) as properties 
  union all
  select struct('14/2/2021 12:00:02'), struct(struct('No'), struct(5, 'Friday'), struct('Phone')) 
);                

the output is

enter image description here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Mikhail Berlyant