'Why does PostgreSQL perform sequential scan on indexed column?

Very simple example - one table, one index, one query:

CREATE TABLE book
(
  id bigserial NOT NULL,
  "year" integer,
  -- other columns...
);

CREATE INDEX book_year_idx ON book (year)

EXPLAIN
 SELECT *
   FROM book b
  WHERE b.year > 2009

gives me:

Seq Scan on book b  (cost=0.00..25663.80 rows=105425 width=622)
  Filter: (year > 2009)

Why it does NOT perform index scan instead? What am I missing?



Solution 1:[1]

Did you ANALYZE the table/database? And what about the statistics? When there are many records where year > 2009, a sequential scan might be faster than an index scan.

Solution 2:[2]

@a_horse_with_no_name explained it quite well. Also if you really want to use an index scan, you should generally use bounded ranges in where clause. eg - year > 2019 and year < 2020.

A lot of the times statistics are not updated on a table and it may not be possible to do so due to constraints. In this case, the optimizer will not know how many rows it should take in year > 2019. Thus it selects a sequential scan in lieu of full knowledge. Bounded partitions will solve the problem most of the time.

Solution 3:[3]

In index scan, read head jumps from one row to another which is 1000 times slower than reading the next physical block (in the sequential scan).

So, if the (number of records to be retrieved * 1000) is less than the total number of records, the index scan will perform better.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Frank Heikens
Solution 2 Shitij Goyal
Solution 3 Gaurav Neema