'Snowflake query performance with UNION
We have a bunch of queries where we UNION data from 3 tables at query time ( we get data from 3 sources ) .
I was wondering if query performance would be any better if we were to merge the data into one table ( with a column source so we know where it came from ).
New table would be much bigger so I’m not sure if we should expect any better performance. Is there a general guidance around this?
Solution 1:[1]
There should be no significant difference scanning 3 tables VS scanning 1 table with the merged content.
However, please make sure you're using UNION ALL and not UNION. According to the SQL standard, UNION in SQL eliminates duplicate records, and the process of doing that can be very expensive.
Using UNION where UNION ALL should be used is one of the most common mistakes I've seen in SQL, unfortunately. I blame the standard, not the users though :)
See e.g. here for more discussion.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Marcin Zukowski |
