'Postgres find rows with min/max value in column
I got huge event table with modified date column. And I need to find first and latest event. Queries looks like:
SELECT * FROM event WHERE modified = (SELECT MAX(modified) FROM event);
SELECT * FROM event WHERE modified = (SELECT MIN(modified) FROM event);
I want to speed up this queries. B-tree index on modified could help, but it's not looks like best solution. Even index scan requires O(logn) operations and index itself requires a lot of space, while to store only min and max values constant space is enough.
Is there a way to execute such queries in constant time? Or at lest in O(logn) time with constant space overhead?
Solution 1:[1]
If you do have an index on modified then this might be faster:
select *
from event
order by modified
limit 1
union all
select *
from event
order by modified desc
limit 1
Even index scan requires O(logn) operations
Not if the requirement is to find the highest or lowest value. That is essentially O(1)
Solution 2:[2]
If you have estimates which surely include the min and max, but surely exclude the vast majority of the table, you could use a partial index fence-posted by them. For example here, I know that at least one value is outside the range at each end:
create index on pgbench_accounts (abalance ) where abalance <10 or abalance >999990;
You need to add a spurious WHERE clause to get the index to be used.
select min(abalance) from pgbench_accounts where abalance<10;
I can't measure a difference in performance between the partial index and the full index, but the difference in size is hard to miss (32 MB vs 16 kB for a table with 2 million rows)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | a_horse_with_no_name |
| Solution 2 | jjanes |
