'SQL UNION ALL but with lots of columns on BigQuery?

BG Collection

Above image is a screenshot of my table just as a quick initial reference.

The focal point are the multiple mech columns (mech1, mech2, mech3, and mech4).

Board games in this tables have multiple attributes called mechanisms so I've separated them into 4 different columns.

So I've learned how to combine columns vertically via UNION ALL so that I can query the count of all unique game mechanisms in my table.

However, it got me wondering if there's a shorter and more efficient way to achieve what I've done:

WITH mechanism_info AS
        (
            WITH
                mechanism_col_combined AS
                    (
                        SELECT mech1 AS all_mech_columns_combined
                        FROM `ckda-portfolio-2022.bg_collection.base`
                        UNION ALL
                        ## There's no IS NOT NULL condition defined for column 'mech1' since there's at least one mechanism noted for a game.
                        SELECT mech2
                        FROM `ckda-portfolio-2022.bg_collection.base`
                        WHERE mech2 IS NOT NULL
                        UNION ALL
                        SELECT mech3
                        FROM `ckda-portfolio-2022.bg_collection.base`
                        WHERE mech3 IS NOT NULL
                        UNION ALL
                        SELECT mech4
                        FROM `ckda-portfolio-2022.bg_collection.base`
                        WHERE mech4 IS NOT NULL
                    )
                    ## Temporary table with all mechanism column in the collection combined.
            SELECT DISTINCT(all_mech_columns_combined) AS unique_mechanisms, COUNT(*) AS count
            FROM mechanism_col_combined
            GROUP BY all_mech_columns_combined
            ORDER BY all_mech_columns_combined
        )
SELECT *
FROM mechanism_info

By querying this temp. table, SQL returns the information that I've anticipated as below:

unique_mechanisms | count
Acting            |   1
Action Points     |   3
Action Queue      |   1
Action Retrieval  |   1
Area Movement     |   1
Auction/Bidding   |   5
Bag Building      |   1
Betting & Bluffing|   2
Bingo             |   1
Bluffing          |   7

Now, I want to shorten my code and I know there has to be a way to shorten the repetitive process of combining columns with UNION ALL.

And if there's any other tips or methods on how to shorten my query, please let me know!

Thank you.



Solution 1:[1]

You can convert the multiple columns [mech1, mech2, ...] into a column of array mech_arr and then using UNNEST to convert the column to have scalar value in each row.

For example:

WITH table1 AS (
    SELECT 'AA' AS mech1, 'BB' AS mech2, 'CC' AS mech3,
    UNION ALL SELECT 'AA' AS mech1, 'CC' AS mech2, 'EE' AS mech3
),
table2 AS (SELECT [mech1, mech2, mech3] AS mech_arr FROM table1)

SELECT mech, COUNT(*) AS mech_counts
FROM table2, UNNEST(mech_arr) AS mech
GROUP BY mech

Output

mech    mech_counts
AA  2
BB  1
CC  2
EE  1

Solution 2:[2]

You could send join into the table, but the performance would not improve and the query would be just as long.
You can simplify as follows:

SELECT
  mech_column,
  count(*) "number"
FROM  (
       SELECT mech1 AS mech_column
       FROM `ckda-portfolio-2022.bg_collection.base`
         UNION ALL
       SELECT mech2
       FROM `ckda-portfolio-2022.bg_collection.base`
         UNION ALL
       SELECT mech3
       FROM `ckda-portfolio-2022.bg_collection.base`
         UNION ALL
       SELECT mech4
       FROM `ckda-portfolio-2022.bg_collection.base`
       ) m
WHERE mech_column IS NOT NULL
GROUP BY mech_column
ORDER BY mech_column;
       

Solution 3:[3]

Didn't find a smoother way to query but I did find a way to remove the process of adding WHERE column IS NOT NULL for each and every columns that was used to vertically aggregate them into a single column:

mechanism_info AS
    (
        WITH
            mechanism_col_combined AS
                (
                    SELECT mech1 AS mech_columns
                    FROM `ckda-portfolio-2022.bg_collection.base`
                    UNION ALL
                    SELECT mech2
                    FROM `ckda-portfolio-2022.bg_collection.base`
                    UNION ALL
                    SELECT mech3
                    FROM `ckda-portfolio-2022.bg_collection.base`
                    UNION ALL
                    SELECT mech4
                    FROM `ckda-portfolio-2022.bg_collection.base`
                    ## Removed all WHERE clause from the above columns 
                    and added it below instead.
                )
                ## Temporary table with all mechanism columns in the collection combined.
        SELECT DISTINCT(mech_columns) AS mechanisms, COUNT(*) AS count
        FROM mechanism_col_combined
        WHERE mech_columns IS NOT NULL ## <--- Added here!
        GROUP BY mech_columns
        ORDER BY mech_columns
    )

SELECT * FROM mechanism_info

Since mechanism_info is a nested temp. table, I can just add WHERE mech_columns IS NOT NULL clause and condition to the initial temp. table's setting.

I'm still looking to reduce this query down to something more efficient. It's unfortunate that UNION ALL can't select multiple columns with a single call :(

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 dikesh
Solution 2
Solution 3 LordPuggo