'Mysql Select INNER JOIN with order by very slow
I'm trying to speed up a mysql query. The Listings table has several million rows. If I don't sort them later I get the result in 0.1 seconds but once I sort it takes 7 seconds. What can I improve to speed up the query?
SELECT l.*
FROM listings l
INNER JOIN listings_categories lc
ON l.id=lc.list_id
AND lc.cat_id='2058'
INNER JOIN locations loc
ON l.location_id=loc.id
WHERE l.location_id
IN (7841,7842,7843,7844,7845,7846,7847,7848,7849,7850,7851,7852,7853,7854,7855,7856,7857,7858,7859,7860,7861,7862,7863,7864,7865,7866,7867,7868,7869,7870,7871,7872,7873,7874,7875,7876,7877,7878,7879,7880,7881,7882,7883,7884,7885,7886,7887,7888,7889,7890,7891,7892,7893,7894,7895,7896,7897,7898,7899,7900,7901,7902,7903)
ORDER BY date
DESC LIMIT 0,10;
EXPLAIN SELECT: Using Index l=date, loc=primary, lc=primary
Solution 1:[1]
Such performance questions are really difficult to answer and depend on the setup, indexes etc. So, there will likely not the one and only solution and even not really correct or incorrect attempts to improve the speed. This is a lof of try and error. Anyway, some points I noted which often cause performance issues are:
- Avoid conditions within joins that should be placed in the where instead. A join should contain the columns only that will be joined, no further conditions. So the "lc.cat_id='2058" should be put in the where clause.
- Using IN is often slow. You could try to replace it by using OR (l.location_id = 7841 OR location_id = 7842 OR...)
- Open the query execution plan and check whether there is something useful for you.
- Try to find out if there are special cases/values within the affected columns which slow down your query
- Change "ORDER BY date" to "ORDER BY tablealias.date" and check if this makes a difference in performance. Even if not, it is better to read.
- If you can rename the column "date", do this because using SQL keywords as table name or column name is no good idea. I'm unsure if this influences the performance, but it should be avoided if possible.
Good luck!
Solution 2:[2]
You can try additonal indexes to speed up the query, but you'll have a tradeoff when creating/manipulating data.
These combined keys could speed up the query:
listings: date, location_id
listings_categories: cat_id, list_id
Since the plan says it uses the date index, there wouldn't be a need to read the record to check the location_id when usign the new index, and same for the join with listinngs_category, index read would be enough
Solution 3:[3]
l: INDEX(location_id, id)
lc: INDEX(cat_id, list_id)
If those don't suffice, try the following rewrite.
SELECT l2.*
FROM
(
SELECT l1.id
FROM listings AS l1
JOIN listings_categories AS lc ON lc.list_id = l1.id
JOIN locations AS loc ON loc.id = l1.location_id
WHERE lc.cat_id='2058'
AND l1.location_id IN (7841, ..., 7903)
ORDER BY l1.date DESC
LIMIT 0,10
) AS x
JOIN listings l2 ON l1.id = x.id
ORDER BY l2.date DESC
With
listings: INDEX(location_id, date, id)
listings_categories: INDEX(cat_id, list_id)
The idea here is to get the 10 ids from the index before reaching to the table itself. Your version is probably shoveling around the whole table before sorting, and then delivering the 10.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Turo |
| Solution 3 | Rick James |
