'speed up complex query with joins

I am trying to make this query faster

SELECT
    contacts_cstm.case_id_c AS 'Case_id',
CONCAT(cont.first_name, ' ',cont.last_name) AS 'Contact Name',
contacts_cstm.id_c AS 'Id',
'Contacts' AS 'Module',
reso_tax_preparation_cstm.percentage_paid_c as '% Paid'
[... more cols in select ...]
FROM
contacts cont
INNER JOIN contacts_cstm on contacts_cstm.id_c = cont.id and cont.deleted=0
INNER JOIN leads on leads.contact_id = cont.id AND leads.deleted=0
LEFT JOIN leads_aggre_aggregatevalues_1_c ON leads.id = leads_aggre_aggregatevalues_1_c.leads_aggre_aggregatevalues_1leads_ida AND leads_aggre_aggregatevalues_1_c.deleted=0
LEFT JOIN aggre_aggregatevalues_cstm ON leads_aggre_aggregatevalues_1_c.leads_aggre_aggregatevalues_1aggre_aggregatevalues_idb = aggre_aggregatevalues_cstm.id_c 
Left JOIN contacts_ccs_ccs_1_c on contacts_ccs_ccs_1_c.contacts_ccs_ccs_1contacts_ida = cont.id AND contacts_ccs_ccs_1_c.deleted = 0
Left JOIN ccs_ccs on ccs_ccs.id = contacts_ccs_ccs_1_c.contacts_ccs_ccs_1ccs_ccs_idb AND ccs_ccs.deleted = 0
LEFT JOIN ccs_ccs_cstm on ccs_ccs_cstm.id_c = ccs_ccs.id
LEFT JOIN users advocate on advocate.id = contacts_cstm.user_id1_c
LEFT JOIN users practitioner on practitioner.id = contacts_cstm.user_id2_c
LEFT JOIN contacts_reso_tax_preparation_1_c on contacts_reso_tax_preparation_1_c.contacts_reso_tax_preparation_1contacts_ida = cont.id 
LEFT JOIN reso_tax_preparation_cstm on  reso_tax_preparation_cstm.id_c = contacts_reso_tax_preparation_1_c.contacts_reso_tax_preparation_1reso_tax_preparation_idb AND contacts_reso_tax_preparation_1_c.deleted=0
 
WHERE
(DATEDIFF(NOW(),contacts_cstm.last_contact_date_c) >= 14)  
                   
AND (reso_tax_preparation_cstm.percentage_paid_c != '100.00' OR (reso_tax_preparation_cstm.percentage_paid_c = '100.00' AND DATEDIFF(NOW(),contacts_cstm.last_contact_date_c) < 120) ) 

I tried following things

using Explain to see details table contacts_reso_tax_preparation_1_c used 113470 rows .

this is the explained output

enter image description here

so these condition

AND (reso_tax_preparation_cstm.percentage_paid_c != '100.00' OR (reso_tax_preparation_cstm.percentage_paid_c = '100.00' AND DATEDIFF(NOW(),contacts_cstm.last_contact_date_c) < 120) ) 

taking time

  1. moved deleted=0 condition from where condition to join condition

  2. Applied limit 1 to exist queries

  3. innodb to related tables

  4. applied indexed to following columns by these command

    ALTER TABLE reso_tax_preparation_cstm ADD INDEX(percentage_paid_c); ALTER TABLE contacts_cstm ADD INDEX( ctax_status_update_date_c, last_contact_date_c, resolution_type_c, resolution_service_level_c, misc_reso_service_level_c, tax_prep_missing_info_c); ALTER TABLE contacts_cstm ADD INDEX(user_id2_c); ALTER TABLE reso_tax_preparation_cstm ADD INDEX(service_status_c); ALTER TABLE reso_ancillary_services_cstm ADD INDEX(service_status_c); ALTER TABLE contacts_cstm ADD INDEX(missing_info_final_deadline_c);

  5. trying to split these queries into unions ,but count changes.

query still takes 9 seconds

  1. converted left joins to subqueries

    SELECT
     contacts_cstm.case_id_c AS 'Case_id',
      CONCAT( cont.first_name, ' ', cont.last_name ) AS 'Contact Name',
      contacts_cstm.id_c AS 'Id',
      'Contacts' AS 'Module',
       (select percentage_paid_c  AS '% Paid' from  
        contacts_reso_tax_preparation_1_c 
      LEFT JOIN
        reso_tax_preparation_cstm 
        ON reso_tax_preparation_cstm.id_c = contacts_reso_tax_preparation_1_c.contacts_reso_tax_preparation_1reso_tax_preparation_idb 
       AND contacts_reso_tax_preparation_1_c.deleted = 0 
    
        where 
        contacts_reso_tax_preparation_1_c.contacts_reso_tax_preparation_1contacts_ida = cont.id 
       AND
       ( 
           reso_tax_preparation_cstm.percentage_paid_c != '100.00' 
       OR
       (
          reso_tax_preparation_cstm.percentage_paid_c = '100.00' 
          AND DATEDIFF( NOW(), contacts_cstm.last_contact_date_c) < 120 
       )
    )
      limit 1
      ),
    
    DATEDIFF(NOW(), contacts_cstm.last_contact_date_c) AS 'Days since Last Contact',
    (
       SELECT
          COUNT(reqcl_requiredclientinfo.id) 
       FROM
          contacts 
          INNER JOIN
             contacts_cstm 
             ON contacts_cstm.id_c = contacts.id 
          LEFT JOIN
             contacts_reqcl_requiredclientinfo_1_c 
             ON contacts_reqcl_requiredclientinfo_1_c.contacts_reqcl_requiredclientinfo_1contacts_ida = contacts.id 
          LEFT JOIN
             reqcl_requiredclientinfo 
             ON reqcl_requiredclientinfo.id = contacts_reqcl_requiredclientinfo_1_c.contacts_reqcl_requiredclientinfo_1reqcl_requiredclientinfo_idb 
             AND reqcl_requiredclientinfo.deleted = 0 
          LEFT JOIN
             reqcl_requiredclientinfo_cstm 
             ON reqcl_requiredclientinfo_cstm.id_c = reqcl_requiredclientinfo.id 
       WHERE
          contacts.id = cont.id 
    )
    AS 'Number of Required Info Records',
    contacts_cstm.resolution_type_c AS 'Resolution Type',
    CONCAT( advocate.first_name, ' ', advocate.last_name ) AS 'Current Case Advocate',
    CONCAT( practitioner.first_name, ' ', advocate.last_name ) AS 'Current Practitioner',
    DATEDIFF( NOW(), contacts_cstm.ctax_status_update_date_c) AS 'Days in active Ctax Status ',
    contacts_cstm.resolution_service_level_c AS 'Resolution Service Level',
    contacts_cstm.tax_preparation_level_c AS 'Tax Prep Service Level',
    contacts_cstm.misc_reso_service_level_c AS 'Misc Res Service Level',
    ccs_ccs_cstm.ccs_status_c AS `CCS Status` 
    FROM
      contacts cont 
     INNER JOIN
       contacts_cstm 
       ON contacts_cstm.id_c = cont.id 
    INNER JOIN
       leads 
       ON leads.contact_id = cont.id 
    LEFT JOIN
       leads_aggre_aggregatevalues_1_c 
       ON leads.id = leads_aggre_aggregatevalues_1_c.leads_aggre_aggregatevalues_1leads_ida 
       AND leads_aggre_aggregatevalues_1_c.deleted = 0 
    LEFT JOIN
       aggre_aggregatevalues_cstm 
       ON leads_aggre_aggregatevalues_1_c.leads_aggre_aggregatevalues_1aggre_aggregatevalues_idb = aggre_aggregatevalues_cstm.id_c 
    LEFT JOIN
       contacts_ccs_ccs_1_c 
       ON contacts_ccs_ccs_1_c.contacts_ccs_ccs_1contacts_ida = cont.id 
       AND contacts_ccs_ccs_1_c.deleted = 0 
    LEFT JOIN
       ccs_ccs 
       ON ccs_ccs.id = contacts_ccs_ccs_1_c.contacts_ccs_ccs_1ccs_ccs_idb 
       AND ccs_ccs.deleted = 0 
    LEFT JOIN
       ccs_ccs_cstm 
       ON ccs_ccs_cstm.id_c = ccs_ccs.id 
    LEFT JOIN
       users advocate 
       ON advocate.id = contacts_cstm.user_id1_c 
    LEFT JOIN
       users practitioner 
       ON practitioner.id = contacts_cstm.user_id2_c 
    

this reduces time and count , but i wanted to be sure this is right

what more can i try



Solution 1:[1]

  • Avoid un-sargable constructs:

    AND  DATEDIFF(NOW(),contacts_cstm.last_contact_date_c) < 120) 
    

-->

  AND  contacts_cstm.last_contact_date_c > NOW() - INTERVAL 120 DAY
  • OR is hard to Optimize. If you can't get rid of it, we can talk about using UNION.

  • Remove any Joins that do not actually participate in the query.

  • Single-column indexes are sometimes not as good as 'composite' (multi-column) indexes.

  • Some of these indexes may help with the original query andor with the 'subquery' reformulation:

    aggre_aggregatevalues_cstm:  INDEX(id_c)
    ccs_ccs_cstm:  INDEX(id_c,  ccs_status_c)
    contacts_cstm:  INDEX(id_c)
    contacts_cstm:  INDEX(last_contact_date_c, id_c,  case_id_c, user_id1_c, user_id2_c)
    contacts_ccs_ccs_1_c:  INDEX(contacts_ccs_ccs_1contacts_ida,  deleted, contacts_ccs_ccs_1ccs_ccs_idb)
    contacts_reso_tax_preparation_1_c:  INDEX(contacts_reso_tax_preparation_1contacts_ida,  contacts_reso_tax_preparation_1reso_tax_preparation_idb, deleted)
    contacts_reqcl_requiredclientinfo_1_c:  INDEX(contacts_reqcl_requiredclientinfo_1contacts_ida,  contacts_reqcl_requiredclientinfo_1reqcl_requiredclientinfo_idb)
    leads:  INDEX(contact_id,  deleted, id)
    leads:  INDEX(contact_id,  id)
    leads_aggre_aggregatevalues_1_c:  INDEX(leads_aggre_aggregatevalues_1leads_ida,  deleted, leads_aggre_aggregatevalues_1aggre_aggregatevalues_idb)
    reso_tax_preparation_cstm:  INDEX(percentage_paid_c, id_c)
    reqcl_requiredclientinfo_cstm:  INDEX(id_c)
    

When adding a composite index, DROP index(es) with the same leading columns. That is, when you have both INDEX(a) and INDEX(a,b), toss the former.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1