'deleting specific duplicate and original entries in a table based on date
i have a table called "main" which has 4 columns, ID, name, DateID and Sign.
i want to create a query that will delete entries in this table if there is the same ID record in twice within a certain DateID.
i have my where clause that searches the previous 3 weeks
where DateID =((SELECT MAX( DateID)
WHERE DateID < ( SELECT MAX( DateID )-3))
e.g of my dataset im working with:
| id | name | DateID | sign |
|---|---|---|---|
| 12345 | Paul | 1915 | Up |
| 23658 | Danny | 1915 | Down |
| 37868 | Jake | 1916 | Up |
| 37542 | Elle | 1917 | Up |
| 12345 | Paul | 1917 | Down |
| 87456 | John | 1918 | Up |
| 78563 | Luke | 1919 | Up |
| 23658 | Danny | 1920 | Up |
in the case above, both entries for ID 12345 would need to be removed. however the entries for ID 23658 would need to be kept as the DateID > 3
how would this be possible?
Solution 1:[1]
You can use window functions for this.
It's not quite clear, but it seems LAG and conditional COUNT should fit what you need.
DELETE t
FROM (
SELECT *,
CountWithinDate = COUNT(CASE WHEN t.PrevDate >= t.DateId - 3 THEN 1 END) OVER (PARTITION BY t.id)
FROM (
SELECT *,
PrevDate = LAG(t.DateID) OVER (PARTITION BY t.id ORDER BY t.DateID)
FROM YourTable t
) t
) t
WHERE CountWithinDate > 0;
Note that you do not need to re-join the table, you can delete directly from the t derived table.
Solution 2:[2]
Hope this works:
DELETE FROM test_tbl
WHERE id IN (
SELECT T1.id
FROM test_tbl T1
WHERE EXISTS (SELECT 1 FROM test_tbl T2 WHERE T1.id = T2.id AND ABS(T2.dateid - T1.dateid) < 3 AND T1.dateid <> T2.dateid)
)
In case you need more logic for data processing, I would suggest using Stored Procedure.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Charlieface |
| Solution 2 | Hana |
