'Deleting duplicate values using time as criteria SQLite
After some research and being an extremely beginner at SQL I couldn't figure out how to solve this task.
The data set has duplicate data (CAPTION_TEXT), which i would like to clean BUT only should be consider duplicate data, rows within a 2 segs of difference(CAPTION_START).
Example: -Row N°9 and row N°19 have the same value. -Difference between Row's N°9 and Row's N°19 Caption_start <2 seconds. -There is a duplicate. It must be deleted.
Solution 1:[1]
Assuming that ID is the primary key of the table, you can use the function julianday():
DELETE FROM tablename AS t1
WHERE EXISTS (
SELECT 1
FROM tablename AS t2
WHERE t2.ID <> t1.ID
AND t2.CAPTION_TEXT = t1.CAPTION_TEXT
AND julianday(t1.CAPTION_START) - julianday(t2.CAPTION_START) BETWEEN 0 AND 2.0 / (24 * 60 * 60)
)
If you want to insert the rows in a new table:
INSERT INTO new_table (ID, CAPTION_START, CAPTION_END, CAPTION_TEXT)
SELECT * FROM tablename AS t1
WHERE NOT EXISTS (
SELECT 1
FROM tablename AS t2
WHERE t2.ID <> t1.ID
AND t2.CAPTION_TEXT = t1.CAPTION_TEXT
AND julianday(t1.CAPTION_START) - julianday(t2.CAPTION_START) BETWEEN 0 AND 2.0 / (24 * 60 * 60)
)
See a simplified demo.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | forpas |

