'MySQL returning only part of results from FULL TEXT SEARCH after version 5.6 to 5.7 upgrade
I have a query that consists of two full text searches in boolean mode (combined with OR operator) that worked just fine on MySQL 5.6 and that fails after bumping MySQL version 5.7. Both DBs have the exact same set of records, both are hosted on AWS (InnoDB, aurora).
Query below (don't pay too much attention to the table/column names as I tried to anonymise them):
SELECT
cars.id
FROM cars
INNER JOIN driver_licenses ON driver_licenses.car_id = cars.id
INNER JOIN drivers ON drivers.id = driver_licenses.driver_id AND drivers.noobie_driver = 0
WHERE (
(MATCH(cars.name) AGAINST ('mark*' IN BOOLEAN MODE))
OR (MATCH(drivers.first_name, drivers.last_name, drivers.email) AGAINST ('mark*' IN BOOLEAN MODE))
);
Of course I have the fulltext index on the [first_name, last_name, email] columns, as well a btree index on noobie_driver. There are two indices on cars.name - one btree and the other one fulltext.
Before the upgrade, query returned proper results (counted in hundreds compared to a few million records in total).
After the upgrade - it seems that the query/optimizer focuses only on the first condition and completely disregards the second full text search (by driver's names and email) and returns only few records - related directly to the search result of cars.name.
When queries are ran separately (first time for cars.name and then for drivers details) and then combined, they return same results as before the upgrade.
Also when I force to ignore noobie_driver index (or remove the noobie_driver condition), both full text search conditions are taken into consideration.
Running EXPLAIN in both DBs return the same results.
+----+-------------+---------------------------+------------+--------+------------------------------------------------------------------------------------------------------+---------------------------------------------------+---------+-----------------------------------------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------------------+------------+--------+------------------------------------------------------------------------------------------------------+---------------------------------------------------+---------+-----------------------------------------------------+------+----------+-------------+
| 1 | SIMPLE | drivers | NULL | ref | PRIMARY,index_drivers_on_noobie_driver | index_drivers_on_noobie_driver | 1 | const | 6798 | 100.00 | NULL |
| 1 | SIMPLE | driver_licenses | NULL | ref | index_driver_licenses_on_car_id,index_driver_licenses_on_driver_id | index_driver_licenses_on_driver_id | 5 | Rental.drivers.id | 1 | 100.00 | Using where |
| 1 | SIMPLE | cars | NULL | eq_ref | PRIMARY | PRIMARY | 4 | Rental.driver_licenses.car_id | 1 | 100.00 | Using where |
+----+-------------+---------------------------+------------+--------+------------------------------------------------------------------------------------------------------+---------------------------------------------------+---------+-----------------------------------------------------+------+----------+-------------+
Tomorrow I'll be working on rebuilding the index/table(s) to see if that brings any changes to the behaviour on 5.7, once it's done I'll come back with more details. Running OPTIMIZE TABLE on all 3 tables haven't fixed anything here.
I'm wondering:
- Have I missed something and it is a feature now in 5.7 now that it behaves this way?
- How to overcome the issue and keep the exact same query (so without ignoring the index or performing two separate queries to combine the results afterwards)?
Solution 1:[1]
OK, dropping and recreating the index on noobie_driver column seems to do the trick on smoke-environment database that contains just few thousands records in the drivers table
DROP INDEX index_drivers_on_noobie_driver ON drivers;
CREATE INDEX index_drivers_on_noobie_driver USING BTREE ON drivers(noobie_driver);
BUT with production data that handles ~2kk records in the drivers" table, dropping and recreating an index did not help. I'm starting to believe it could be related to some bug strictly related to MySQL version.
Will be updating the question once I learn something new
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
