'Compare Source to Latest Record in Target
I want to compare the records in the source table to the latest record (in each partition) in the target table
I have already accomplished this by creating a temporary view with the latest record in each partition in the target
Source:
id|name|salary|age
------------------
1|John|2500|25
2|Mike|2500|30
Target:
id|name|salary|age|sdate
------------------------
1|John|1500|20|20190215
1|John|2000|22|20190318
Get the latest record per partition in target and compare non-key columns to corresponding non-key columns in source:
CREATE TEMPORARY VIEW latest_rec AS
SELECT id, name, salary, age
FROM ( SELECT id, name, salary, age,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY sdate DESC) AS RN
FROM target_tab) tt
WHERE tt.RN = 1
...
SELECT s.id, s.name, s.salary, s.age
FROM src_tab s
LEFT OUTER JOIN latest_rec t
ON s.id = t.id
AND MD5(s.name || s.salary || s.age) <> MD5(t.name || t.salary || t.age)
Expected Output:
id|name|salary|age|sdate
1|John|1500|20|20190215
1|John|2000|22|20190318
1|John|2500|25|20190525
2|Mike|2500|30|20190525
I was wondering if there were a better and smarter way to achieve the same functionality. Please let me know if additional information is necessary. Thanks
Solution 1:[1]
Why wouldn't you just use join and direct comparisons?
SELECT s.id, s.name, s.salary, s.age,
t.id, t.name, t.salary, t.age
FROM src_tab s LEFT JOIN
(SELECT id, name, salary, age,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY sdate DESC) AS RN
FROM target_tab
) tt
ON tt.id = s.id AND tt.RN = 1
WHERE (s.name <> t.name OR t.name IS NULL) OR
(s.salary <> t.salary OR t.salary IS NULL) OR
(s.age <> t.name OR t.age IS NULL) ;
This puts the comparisons in a single row.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Gordon Linoff |
