'Selecting multiple schemas in a select statement

The db being accessed is on Snowflake; not certain on the storage details behind the scenes.

I have a query right now that creates a new view from 41 data tables stored in separate schemas under the same database, looks something like this:

CREATE VIEW all_data AS
SELECT * FROM db.schema1.data UNION ALL
SELECT * FROM db.schema2.data UNION ALL
SELECT * FROM db.schema3.data

This query is run daily. My issue is I get new data tables added every few days and I have to go manually edit the query to include those new tables, as they're stored under separate schemas (and the naming scheme for the schemas isn't consistent either, for reasons outside my control). Is there a way I can select all the schemas inside a database with a subquery that would allow me to run the query daily without needing manual updates when new schemas + tables are added?

I'd like the resulting query to have a structure somewhat like

CREATE VIEW all_data as 
SELECT * FROM [SELECT schemas from db].data

but not sure how that would work, and how to union the resulting data correctly.



Solution 1:[1]

Unfortunately, in Snowflake you can't dynamically construct SQL statements (yet). You can of course do what you want to achieve via a script in one of the supported languages (e.g. Python, JS), by first finding all the schemas and then constructing a full SQL statement.

Hope this helps.

Solution 2:[2]

You can definitely query the table and schema list available. SQL Authority has a good article on it: http://blog.sqlauthority.com/2009/06/17/sql-server-list-schema-name-and-table-name-for-database/

In short the query winds up being along these lines to pull the list of tables and schema:

SELECT '['+SCHEMA_NAME(schema_id)+'].['+name+']'
AS SchemaTable
FROM sys.tables

Though you will have to add a database name to the where clause to point to the proper DB.

Solution 3:[3]

With the release of Snowflake Scripting dynamic recreation of the view inside Snowflake is now very possible.

create database dynamic_views;
create schema dynamic_views.schema_base;
create schema dynamic_views.schema1;
create table dynamic_views.schema1.data(id int) as select * from values (1);

We can use the INFORMATION_SCEMA.TABLES to find all DATA tables:

SELECT table_schema 
FROM dynamic_views.information_schema.tables 
WHERE table_name = 'DATA';
TABLE_SCHEMA
SCHEMA1

and now push that into a cursor and build up a view creation SQL

This SQL needs to be run in the new Snowsight Console see (Working with Classic Console):

declare
  sql text;
  add_union boolean := false;
  c1 cursor for SELECT TABLE_SCHEMA 
        FROM dynamic_views.information_schema.TABLES 
        WHERE TABLE_NAME = 'DATA';
begin
  sql := 'CREATE OR REPLACE VIEW dynamic_views.schema_base.all_data AS ';
  for record in c1 do
    if (add_union) then
         sql := sql || 'UNION ALL ';
    end if;
    sql := sql || 'SELECT * FROM dynamic_views.'|| record.TABLE_SCHEMA ||'.data ';
    add_union := true;
  end for;
  
  EXECUTE IMMEDIATE sql;
  return sql;
end;
;

and we can use it:

select * from dynamic_views.schema_base.all_data;
ID
1

and add more:

create schema dynamic_views.schema2;
create table dynamic_views.schema2.data(id int) as select * from values (2);

rebuild:

anonymous block
CREATE OR REPLACE VIEW dynamic_views.schema_base.all_data AS SELECT * FROM dynamic_views.SCHEMA1.data UNION ALL SELECT * FROM dynamic_views.SCHEMA2.data

use it again:

select * from dynamic_views.schema_base.all_data;
ID
1
2

Note: You should not use SELECT * in production as the order of the table columns will be dependent of the create orders, and if newer tables have a different shape you view will become invalid.

So the explicit form really should be used:

'SELECT column1, column2, column4 FROM dynamic_views.'|| record.TABLE_SCHEMA ||'.data ';

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Marcin Zukowski
Solution 2 Zi0n1
Solution 3 Simeon Pilgrim