'delete all rows keep only one per hour group by strftime

I have a Home Assistant sqlite file.

My problem is that it grows fast and reaches enormous dimensions. So I made a php script that cleans it every 24 hours. I delete old and irrelevant data and leave only the data I need.

In addition I wanted to optimize the data I leave and delete for each sensor all the data except one value per hour, it will give me the information I need with all the graphs and reduce the file size.

I was thinking of using the not in option and wrote the following command:

 DELETE from states where state_id not in (select state_id, strftime ('% Y-% m-% d% H', last_changed, 'localtime') as date_hour from states group by entity_id, date_hour);

I sort all my sensors by sensor ID and by time and date and delete everything except one per hour.

But when I ran the command I got: Error: only a single result allowed for a SELECT that is part of an expression

Since I can not remove the date_hour column how do you guys think I should do it? Thank you!



Solution 1:[1]

The subquery returns 2 columns so you must compare 2 columns against its results:

DELETE FROM states 
WHERE (entity_id, last_changed) NOT IN (
  SELECT entity_id, MAX(last_changed)
  FROM states 
  GROUP BY entity_id, strftime('%Y-%m-%d% %H', last_changed) 
); 

See the demo.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 forpas