'Deleting duplicate values using time as criteria SQLite

After some research and being an extremely beginner at SQL I couldn't figure out how to solve this task.

The data set has duplicate data (CAPTION_TEXT), which i would like to clean BUT only should be consider duplicate data, rows within a 2 segs of difference(CAPTION_START).

Example: -Row N°9 and row N°19 have the same value. -Difference between Row's N°9 and Row's N°19 Caption_start <2 seconds. -There is a duplicate. It must be deleted.

enter image description here



Solution 1:[1]

Assuming that ID is the primary key of the table, you can use the function julianday():

DELETE FROM tablename AS t1
WHERE EXISTS (
  SELECT 1 
  FROM tablename AS t2 
  WHERE t2.ID <> t1.ID
    AND t2.CAPTION_TEXT = t1.CAPTION_TEXT
    AND julianday(t1.CAPTION_START) - julianday(t2.CAPTION_START) BETWEEN 0 AND 2.0 / (24 * 60 * 60)
)

If you want to insert the rows in a new table:

INSERT INTO new_table (ID, CAPTION_START, CAPTION_END, CAPTION_TEXT) 
SELECT * FROM tablename AS t1
WHERE NOT EXISTS (
  SELECT 1 
  FROM tablename AS t2 
  WHERE t2.ID <> t1.ID
    AND t2.CAPTION_TEXT = t1.CAPTION_TEXT
    AND julianday(t1.CAPTION_START) - julianday(t2.CAPTION_START) BETWEEN 0 AND 2.0 / (24 * 60 * 60)
)

See a simplified demo.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 forpas