'Snowflake - Reconciliation of two tables based on keys
I have two tables (primary and secondary) and we need to do a row and column-level reconciliation between these two tables and get a summary of the differences between these tables.
Table A:
| col_A | col_B | col_C |
|---|---|---|
| One | Two | Three |
| Four | Five | Six |
| Seven | Eight | Nine |
Table B:
| col_A | col_B | col_C |
|---|---|---|
| One | Two | Three |
| Four | Five | ABC |
| Seven | Eight | Nine |
| Nine | Eight | Nine |
In the above table col_A is the primary key column. I want to compare Table A and Table B and produce results like below.
Matched Rows: 2
Unmatched Rows: 1
Columns not matching: col_C (sample key: Four)
Rows Present in Table A but not in B: 0
Rows Present in Table B but not in A: 1 ( Sample key:Nine)
Generally, Table A and Table B have approx. billion rows. What would be the efficient way to do it in Snowflake.
Solution 1:[1]
For row comparison, consider the MINUS/EXCEPT set operator.
SELECT count(*) as countOfRowsInTable1NotInTable2 FROM
(
SELECT col_A, col_B, col_C FROM table1
MINUS
SELECT col_A, col_B, col_C FROM table2
)sub;
You can reverse the order of the inner SELECT statments to get the same count for tableB. You can also perform a SELECT * instead of SELECT COUNT(*) if you want to see which rows exist in one table but not the other.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | JNevill |
