'Different speed of theoretically equal queries on MySQL

I have found a strange speed issue with one of my MySQL queries when run on two different columns, date_from vs date_to.

The table structure is the following:

create table if not exists table1 (
    id unsigned int,
    field2 int,
    field3 varchar(32),
    date_from date not null,
    date_to date not null,
    field6 text
);

create unique index idx_uniq_table1 on table1 (id, field2, field3, date_from);
create index idx_table1_id on table1 (id);
create index idx_table1_field2 on table1 (field2);
create index idx_table1_field3 on table1 (field3);
create index idx_table1_date_from on table1 (date_from);
create index idx_table1_date_to on table1 (date_to);

When I run this query using date_from, execution time is 1.487 seconds:

select field3, min(date_from) from table1 group by field3;

When I run this other query using date_to, execution time is 13.804 seconds, almost 10 times slower:

select field3, max(date_to) from table1 group by field3;

Both columns are NOT NULL, so there are no empty values.

The table has ~7M rows.

The only difference that I see between these two columns is that date_from appears in the unique index but, as far as I know, that should't make a difference if not filtering by all four columns in the index.

Am I missing anything?

This is the explain of the date_from column:

{
    "query_block": {
        "select_id": 1,
        "cost_info": {
        "query_cost": "889148.90"
        },
        "grouping_operation": {
            "using_filesort": false,
            "table": {
                "table_name": "table1",
                "access_type": "index",
                "possible_keys": [
                    "idx_uniq_table1",
                    "idx_table1_id",
                    "idx_table1_field2",
                    "idx_table1_field3",
                    "idx_table1_date_from",
                    "idx_table1_date_to"
                ],
                "key": "idx_table1_field3",
                "used_key_parts": [
                    "field3"
                ],
                "key_length": "130",
                "rows_examined_per_scan": 5952609,
                "rows_produced_per_join": 5952609,
                "filtered": "100.00",
                "using_index": true,
                "cost_info": {
                    "read_cost": "293888.00",
                    "eval_cost": "595260.90",
                    "prefix_cost": "889148.90",
                    "data_read_per_join": "908M"
                },
                "used_columns": [
                    "id",
                    "field2",
                    "field3",
                    "date_from"
                ]
            }
        }
    }
}

This is the explain of the date_to column:

{
    "query_block": {
        "select_id": 1,
        "cost_info": {
            "query_cost": "889148.90"
        },
        "grouping_operation": {
            "using_filesort": false,
            "table": {
                "table_name": "table1",
                "access_type": "index",
                "possible_keys": [
                    "idx_uniq_table1",
                    "idx_table1_id",
                    "idx_table1_field2",
                    "idx_table1_field3",
                    "idx_table1_date_from",
                    "idx_table1_date_to"
                ],
                "key": "idx_table1_field3",
                "used_key_parts": [
                    "field3"
                ],
                "key_length": "130",
                "rows_examined_per_scan": 5952609,
                "rows_produced_per_join": 5952609,
                "filtered": "100.00",
                "cost_info": {
                    "read_cost": "293888.00",
                    "eval_cost": "595260.90",
                    "prefix_cost": "889148.90",
                    "data_read_per_join": "908M"
                },
                "used_columns": [
                    "id",
                    "field2",
                    "field3",
                    "date_from",
                    "date_to"
                ]
            }
        }
    }
}

The only difference I see is in used_columns, at the end, where one contains date_to and the other doesn't.



Solution 1:[1]

In addition to Rick's answer about proper index based on what your criteria is... The reason for the speed difference is that the one index that had both the field3 and date_from, the engine was able to use the data within the actual index instead of having to go to the raw data pages that contain the entire record. The index that only had the date_to still had to go to every raw data record to get the field3, thus taking the time.

That is why you can utilize covering indexes. Having each component of data you are looking for to optimize the query. Not saying you want an index with 20 columns, but in this context of what might be common criteria for filtering is exactly why you do.

Solution 2:[2]

Remove arguments from win function and declare global score and count in win function

def win():
    global score
    global count

and then change all calls from win(score, card) => to win()

I tested your code for few rounds and it was increasing score by 2 per win for a player.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 DRapp
Solution 2 rg_dev