'How to compare a subset of two tables using dynamically generated column names in PostgreSQL?

I have two tables, one a test table and one a production table, both with +200 columns and a couple thousand lines of code to create the table. I periodically make changes and am trying to automate QA. I would like to

  1. Compare all rows between the two tables to detect differences.
  2. Exclude certain columns, either because columns are new (added to test, does not exist in prod) or because they will be different on purpose (table_creation, created_by_used_id, etc).
  3. Use a variable to generate the SELECT list_of_column_names so I do not have to continually manually update the column names I need to compare between the two tables.

#3 is the issue. I know how to do this in python, but am currently limited to doing this only in PostgreSQL and have never done anything with variables in SQL.

Code So Far

So far, I know I can get all columns names from

SELECT *
FROM information_schema.columns
WHERE table_schema = 'my_test_schema'
   AND table_name   = 'my_test_table' 

From there, I can do a FULL JOIN and WHERE clause to join with the prod columns and get a table with 1 column of only the subset column names that I want.

After that, I'm using an EXCEPT/UNION ALL script to compare the tables. The issue below is with the * - I instead need to have some sort of variable or list and use that to select the column names.

SELECT * FROM my_test_table
EXCEPT
SELECT * FROM my_prod_table

UNION ALL

SELECT * from my_prod_table
EXCEPT
SELECT * from my_test_table

I am open to alternate suggestions.



Solution 1:[1]

This will give you the columns which are present in prod table and not in test table and or the other way around:

SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'my_test_schema'
   AND table_name   = 'my_test_table'
except 
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'my_prod_table'
   AND table_name   = 'my_prod_table'

UNION

SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'my_prod_table'
   AND table_name   = 'my_prod_table'
except 
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'my_test_schema'
   AND table_name   = 'my_test_table'

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 nikhil sugandh