'Optimizer is not using index but the query is faster

I have this query and a cluster B-tree index on l_shipdate:

select
    sum(l_extendedprice*l_discount) as revenue
from
    lineitem
where
    l_shipdate>='01/01/1994'
and
    l_shipdate<'01/01/1995'
and
    l_discount between 0.04 and 0.07
and
    l_quantity<24

and if I EXPLAIN ANALYZE it I get

"Finalize Aggregate  (cost=1716219.02..1716219.03 rows=1 width=32) (actual time=9992.581..9992.976 rows=1 loops=1)"
"  ->  Gather  (cost=1716218.80..1716219.01 rows=2 width=32) (actual time=9992.490..9992.970 rows=3 loops=1)"
"        Workers Planned: 2"
"        Workers Launched: 2"
"        ->  Partial Aggregate  (cost=1715218.80..1715218.81 rows=1 width=32) (actual time=9989.613..9989.614 rows=1 loops=3)"
"              ->  Parallel Seq Scan on lineitem  (cost=0.00..1712504.18 rows=542925 width=12) (actual time=0.711..9931.443 rows=443110 loops=3)"
"                    Filter: ((l_shipdate >= '1994-01-01'::date) AND (l_shipdate < '1995-01-01'::date) AND (l_discount >= 0.04) AND (l_discount <= 0.07) AND (l_quantity < 24))"
"                    Rows Removed by Filter: 17050621"
"Planning Time: 0.127 ms"
"Execution Time: 9993.006 ms"

If I drop the clustered B-tree index from l_shipdate I get this plan:

"Finalize Aggregate  (cost=1716219.02..1716219.03 rows=1 width=32) (actual time=9621.221..9621.632 rows=1 loops=1)"
"  ->  Gather  (cost=1716218.80..1716219.01 rows=2 width=32) (actual time=9621.122..9621.626 rows=3 loops=1)"
"        Workers Planned: 2"
"        Workers Launched: 2"
"        ->  Partial Aggregate  (cost=1715218.80..1715218.81 rows=1 width=32) (actual time=9616.306..9616.307 rows=1 loops=3)"
"              ->  Parallel Seq Scan on lineitem  (cost=0.00..1712504.18 rows=542925 width=12) (actual time=0.669..9556.219 rows=443110 loops=3)"
"                    Filter: ((l_shipdate >= '1994-01-01'::date) AND (l_shipdate < '1995-01-01'::date) AND (l_discount >= 0.04) AND (l_discount <= 0.07) AND (l_quantity < 24))"
"                    Rows Removed by Filter: 17050621"
"Planning Time: 0.750 ms"
"Execution Time: 9621.674 ms"

I noticed that the two solutions take the same time (around 10 seconds) and the plan is identical. This means the optimizer is not using the clustered b-tree index on l_shipdate on the first scenario?

EDIT If I set enable_seqscan = OFF and re-run the query with EXPLAIN ANALYZE I get:

"Finalize Aggregate  (cost=1811005.28..1811005.29 rows=1 width=32) (actual time=34118.857..34120.450 rows=1 loops=1)"
"  ->  Gather  (cost=1811005.06..1811005.27 rows=2 width=32) (actual time=34118.442..34120.436 rows=3 loops=1)"
"        Workers Planned: 2"
"        Workers Launched: 2"
"        ->  Partial Aggregate  (cost=1810005.06..1810005.07 rows=1 width=32) (actual time=34110.199..34110.199 rows=1 loops=3)"
"              ->  Parallel Bitmap Heap Scan on lineitem  (cost=106009.94..1807290.43 rows=542925 width=12) (actual time=243.400..34016.240 rows=443110 loops=3)"
"                    Recheck Cond: ((l_shipdate >= '1994-01-01'::date) AND (l_shipdate < '1995-01-01'::date))"
"                    Rows Removed by Index Recheck: 12082549"
"                    Filter: ((l_discount >= 0.04) AND (l_discount <= 0.07) AND (l_quantity < 24))"
"                    Rows Removed by Filter: 2210062"
"                    Heap Blocks: exact=19826 lossy=340997"
"                    ->  Bitmap Index Scan on l_shipdate_idx  (cost=0.00..105684.18 rows=7889962 width=0) (actual time=243.006..243.006 rows=7959517 loops=1)"
"                          Index Cond: ((l_shipdate >= '1994-01-01'::date) AND (l_shipdate < '1995-01-01'::date))"
"Planning Time: 2.023 ms"
"Execution Time: 34120.876 ms"
  • Does this mean the Optimizer is not using the index on l_shipdate because l_shipdate>='01/01/1994'and l_shipdate<'01/01/1995' is not that selective as a condition? I guess the Optimizer prefers using the sequential scan on lineitem table, instead.


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source