'PostgreSQL: Identical query on multiple tables sometimes much slower

I have a datawarehouse of financial data in a secluded postgresql environment. First things first: I can guarantee that no other processes are messing with my problem, because this DB is completely isolated from all other systems. I am running PG version 8.0.26 on a virtual Linux (CentOS) machine with 32GB RAM and a 4-core CPU.

Every day I need to update several thousand customer records. I do this with a query that runs for each customer in a For-loop, triggered externally from an R environment. Each iteration of this query is identical except for the customer ID and a date value in one of the where clauses. The query joins several very large tables (total exceeds 1 terabyte), all of which have indexes which usually guarantee a very quick execution of a few milliseconds per iteration. Sometimes however this query takes several seconds or even minutes to complete. This cannot be traced back to specific customer IDs or other values which may change for an individual iteration.

Since the execution is usually very quick, I don't think that my indexes are a problem, but something else that I can't put my finger on. All I can see is that at least the first couple of thousand iterations are usually very quick, with later iterations getting slower (as I said, only sometimes).

I have tried to analyze the problem using EXPLAIN ANALYZE but then the query in question is much quicker than when run in the aforementioned loop.

Any help is greatly appreciated!

Kind regards, Johannes

EDIT: This is the explain output of a typical query. It took more than 5 seconds in the regular loop but only a few ms when I ran the explain:

Hash Join  (cost=28.92..390.19 rows=1 width=167) (actual time=2.035..12.047 rows=193 loops=1)
  Hash Cond: (d.dataitemid = di.dataitemid)
  ->  Nested Loop  (cost=22.08..383.29 rows=15 width=138) (actual time=1.725..11.398 rows=1075 loops=1)
        ->  Nested Loop  (cost=21.37..23.22 rows=1 width=131) (actual time=1.516..1.650 rows=17 loops=1)
              ->  Hash Join  (cost=21.24..22.33 rows=1 width=122) (actual time=1.495..1.536 rows=17 loops=1)
                    Hash Cond: (pt.periodtypeid = finperiod.periodtypeid)
                    ->  Seq Scan on periodtype pt  (cost=0.00..1.06 rows=6 width=9) (actual time=0.037..0.043 rows=6 loops=1)
                    ->  Hash  (cost=21.23..21.23 rows=1 width=117) (actual time=1.375..1.382 rows=17 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 11kB
                          ->  Nested Loop  (cost=2.69..21.23 rows=1 width=117) (actual time=0.627..1.348 rows=17 loops=1)
                                ->  Nested Loop  (cost=2.55..20.48 rows=1 width=95) (actual time=0.592..1.264 rows=17 loops=1)
                                      ->  Nested Loop  (cost=2.41..19.75 rows=1 width=69) (actual time=0.567..1.182 rows=17 loops=1)
                                            Join Filter: (i.financialinstanceid = ic.financialinstanceid)
                                            ->  Nested Loop  (cost=1.83..8.66 rows=4 width=67) (actual time=0.462..0.679 rows=17 loops=1)
                                                  ->  Nested Loop  (cost=1.26..6.51 rows=1 width=57) (actual time=0.344..0.430 rows=3 loops=1)
                                                        ->  Nested Loop  (cost=1.13..5.62 rows=1 width=36) (actual time=0.317..0.391 rows=3 loops=1)
                                                              ->  Index Only Scan using finperiod_2 on finperiod  (cost=0.56..2.82 rows=1 width=10) (actual time=0.168..0.193 rows=3 loops=1)
                                                                    Index Cond: ((companyid = 874654) AND (fiscalyear = 2021) AND (fiscalquarter = 2))
                                                                    Filter: (periodtypeid = ANY ('{1,2,3,4}'::integer[]))
                                                                    Rows Removed by Filter: 1
                                                                    Heap Fetches: 0
                                                              ->  Index Scan using fininstance_2 on fininstance i  (cost=0.56..2.79 rows=1 width=30) (actual time=0.060..0.061 rows=1 loops=3)
                                                                    Index Cond: ((financialperiodid = finperiod.financialperiodid) AND (filingdate >= '2021-02-14 00:00:00'::timestamp without time zone) AND (filingdate <= '2022-03-22 00:00:00'::timestamp without time zone))
                                                                    Filter: ((instancetypeid = ANY ('{2,3}'::integer[])) AND (restatementtypeid = ANY ('{2,3,5}'::integer[])))
                                                        ->  Index Only Scan using fininstancetype_2 on fininstancetype it  (cost=0.13..0.70 rows=1 width=25) (actual time=0.010..0.010 rows=1 loops=3)
                                                              Index Cond: (instancetypeid = i.instancetypeid)
                                                              Heap Fetches: 3
                                                  ->  Index Only Scan using fincollection_1 on fincollection c  (cost=0.57..2.10 rows=5 width=10) (actual time=0.049..0.079 rows=6 loops=3)
                                                        Index Cond: (financialinstanceid = i.financialinstanceid)
                                                        Filter: (datacollectiontypeid = ANY ('{1,2,3,5,17,18,19}'::integer[]))
                                                        Rows Removed by Filter: 15
                                                        Heap Fetches: 0
                                            ->  Index Scan using pk_fininstancetocollection on fininstancetocollection ic  (cost=0.57..2.76 rows=1 width=10) (actual time=0.028..0.028 rows=1 loops=17)
                                                  Index Cond: ((financialinstanceid = c.financialinstanceid) AND (financialcollectionid = c.financialcollectionid))
                                                  Filter: (instancetocollectiontypeid = ANY ('{1,8,9,22}'::integer[]))
                                      ->  Index Scan using pk_findatacollectiontype on findatacollectiontype ct  (cost=0.14..0.71 rows=1 width=30) (actual time=0.003..0.003 rows=1 loops=17)
                                            Index Cond: (datacollectiontypeid = c.datacollectiontypeid)
                                ->  Index Scan using pk_fininstancetocollecttype on fininstancetocollecttype ict  (cost=0.14..0.71 rows=1 width=26) (actual time=0.004..0.004 rows=1 loops=17)
                                      Index Cond: (instancetocollectiontypeid = ic.instancetocollectiontypeid)
              ->  Index Scan using pk_restatementtype on restatementtype rt  (cost=0.13..0.70 rows=1 width=17) (actual time=0.004..0.004 rows=1 loops=17)
                    Index Cond: (restatementtypeid = i.restatementtypeid)
        ->  Index Scan using fincollectiondata_3 on fincollectiondata d  (cost=0.71..357.07 rows=300 width=15) (actual time=0.037..0.551 rows=63 loops=17)
              Index Cond: (financialcollectionid = c.financialcollectionid)
              Filter: (nmflag = 0)
              Rows Removed by Filter: 8
  ->  Hash  (cost=4.15..4.15 rows=215 width=33) (actual time=0.211..0.212 rows=215 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 23kB
        ->  Seq Scan on tbl__felder di  (cost=0.00..4.15 rows=215 width=33) (actual time=0.029..0.103 rows=215 loops=1)
Planning Time: 567.738 ms
Execution Time: 25.120 ms


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source