草庐IT

cost量化分析

GreatSQL 2023-03-28 原文
  • GreatSQL社区原创内容未经授权不得随意使用,转载请联系小编并注明来源。
  • GreatSQL是MySQL的国产分支版本,使用上与MySQL一致。
  • 作者: xryz
  • 文章来源:GreatSQL社区原创

前言:

我们在日常维护数据库的时候,经常会遇到查询慢的语句,这时候一般会通过执行EXPLAIN去查看它的执行计划,但是执行计划往往只给我们带来了最基础的分析信息,比如是否有使用索引,还有一些其他供我们分析的信息,比如使用了临时表、排序等等,却无法展示为什么一些其他的执行计划未被选择,比如说明明有索引,或者好几个索引,但是为什么查询时未使用到期望的索引等

explain select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
| id | select_type | table | partitions | type   | possible_keys                        | key           | key_len | ref            | rows | filtered | Extra                 |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
|  1 | SIMPLE      | t2    | NULL       | range  | id_num_unique,idx_age,idx_age_id_num | idx_age       | 1       | NULL           | 9594 |   100.00 | Using index condition |
|  1 | SIMPLE      | t1    | NULL       | eq_ref | id_num_unique,idx_age                | id_num_unique | 60      | test.t2.id_num |    1 |    50.00 | Using where           |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
2 rows in set, 1 warning (0.01 sec)

如上面这个例子,为什么t2表上列出了多个可能使用的索引,却选择了idx_age,优化器为什么选择了指定的索引,这时候并不能直观的看出问题,这时候我们就可以开启optimizer_trace跟踪分析MySQL具体是怎么选择出最优的执行计划的。

OPTIMIZER_TRACE:

optimizer_trace是什么:

optimizer_trace是一个具有跟踪功能的工具,可以跟踪执行的语句的解析优化执行过程,并将跟踪到的信息记录到INFORMATION_SCHEMA.OPTIMIZER_TRACE表中,但是每个会话都只能跟踪它自己执行的语句,并且表中默认只记录最后一个查询的跟踪结果

使用方法:

# 打开optimizer trace功能 (默认情况下它是关闭的):
set optimizer_trace="enabled=on";
select ...; # 这里输入你自己的查询语句
SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
# 当你停止查看语句的优化过程时,把optimizer trace功能关闭
set optimizer_trace="enabled=off";

相关参数:

mysql>  show variables like '%optimizer_trace%';
+------------------------------+----------------------------------------------------------------------------+
| Variable_name                | Value                                                                      |
+------------------------------+----------------------------------------------------------------------------+
| optimizer_trace              | enabled=on,one_line=off                                                    |
| optimizer_trace_features     | greedy_search=on,range_optimizer=on,dynamic_range=on,repeated_subselect=on |
| optimizer_trace_limit        | 1                                                                          |
| optimizer_trace_max_mem_size | 1048576                                                                    |
| optimizer_trace_offset       | -1                                                                         |
+------------------------------+----------------------------------------------------------------------------+
  • optimizer_trace: enabled 开启/关闭optimizer_trace,one_line 是否单行显示,关闭为json模式,一般不开启
  • optimizer_trace_features:跟踪信息中可打印的项,一般不调整默认打印所有项
  • optimizer_trace_limit:存储的跟踪sql条数
  • optimizer_trace_offset:开始记录的sql语句的偏移量,负数表示从最近执行倒数第几条开始记录
  • optimizer_trace_max_mem_size:optimizer_trace内存的大小,如果跟踪信息超过这个大小,信息将会被截断

optimizer_trace表信息:

该表总共有4个字段

  • QUERY 表示我们的查询语句。
  • TRACE 表示优化过程的JSON格式文本。(重点关注)
  • MISSING_BYTES_BEYOND_MAX_MEM_SIZE 由于优化过程可能会输出很多,如果超过某个限制时,多余的文本将不会被显示,这个字段展示了被忽略的文本字节数。
  • INSUFFICIENT_PRIVILEGES 表示是否没有权限查看优化过程,默认值是0,只有某些特殊情况下才会是 1,我们暂时不关心这个字段的值。

信息解读:

通过 optimizer_trace表的query字段可以看到,一条语句的执行过程主要分为三个步骤:

"join_preparation": {},(准备阶段)
"join_optimization": {},(优化阶段)
"join_execution": {},(执行阶段)

各个步骤的详细内容解读:

  • preparation:
expanded_query :将语句进行格式化,补充隐藏的列名和表名等
transformations_to_nested_joins :查询重写,比如join的on改为where语句
  • optimization:
condition_processing{ :条件句处理。
    transformation{:转换类型句。这三次的转换分别是
        equality_propagation(等值条件句转换),如:a = b and b = c and c = 5
        constant_propagation(常量条件句转换),如:a = 1 AND b > a
        trivial_condition_removal(无效条件移除的转换),如:1 = 1
    }
}
substitute_generated_columns :替换虚拟生成列,测试了很多sql,这一列都没有看到有用的信息
table_dependencies :梳理表之间的依赖关系。
ref_optimizer_key_uses :如果优化器认为查询可以使用ref的话,在这里列出可以使用的索引。
rows_estimation{ :估算表行数和扫描的代价。如果查询中存在range扫描的话,对range扫描进行计划分析及代价估算。
  table_scan:全表扫描的行数(rows)以及所需要的代价(cost)。
  potential_range_indexes:该阶段会列出表中所有的索引并分析其是否可用,并且还会列出索引中可用的列字段。
  analyzing_range_alternatives :分析可选方案的代价。
}
considered_execution_plans{ :对比各可行计划的代价,选择相对最优的执行计划。
  plan_prefix:前置的执行计划。
  best_access_path:当前最优的执行顺序信息结果集。
  access_type表示使用索引的方式,可参照为explain中的type字段。
  condition_filtering_pct:类似于explain中的filtered列,这是一个估算值。
  rows_for_plan:该执行计划最终的扫描行数,这里的行数其实也是估算值,是由considered_access_paths的resulting_rows相乘之后再乘以condition_filtering_pct获得。
  cost_for_plan:该执行计划的执行代价,由considered_access_paths的cost相加而得。
  chosen:是否选择了该执行计划。
}
attaching_conditions_to_tables :添加附加条件,使得条件尽可能筛选单表数据。
refine_plan :优化后的执行计划。

trace信息中的json信息很长,因为我们关心的是不同执行计划的cost区别,所以只需要重点关注两个部分rows_estimation 和considered_execution_plans

代价模型计算:

统计信息和cost计算参数:

计算cost会涉及到表的主键索引数据页(聚簇索引)数量和表中的记录数,两个信息都可以通过innodb的表统计信息mysql.innodb_table_stats查到,n_rows是记录数,clustered_index_size是聚簇索引页数。

mysql> select * from mysql.innodb_table_stats where table_name='basic_person_info';
+---------------+-------------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name        | last_update         | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+-------------------+---------------------+--------+----------------------+--------------------------+
| test          | basic_person_info | 2022-12-23 18:27:24 |  86632 |                  737 |                     1401 |
+---------------+-------------------+---------------------+--------+----------------------+--------------------------+
1 row in set (0.01 sec)

代价模型将操作分为Server层和Engine(存储引擎)层两类,Server层主要是CPU代价,Engine层主要是IO代价,比如MySQL从磁盘读取一个数据页的代价io_block_read_cost为1,从buffer pool读取的代价memory_block_read_cost为0.25,计算符合条件的行代价为row_evaluate_cost为0.1,除此之外还有:

  • memory_temptable_create_cost (default 1.0) 内存临时表的创建代价。
  • memory_temptable_row_cost (default 0.1) 内存临时表的行代价。
  • key_compare_cost (default 0.1) 键比较的代价,例如排序。
  • disk_temptable_create_cost (default 20.0) 内部myisam或innodb临时表的创建代价。
  • disk_temptable_row_cost (default 0.5) 内部myisam或innodb临时表的行代价。

这些都可以通过mysql.server_cost、mysql.engine_cost查看defalt值和设置值

mysql> select * from mysql.server_cost;
+------------------------------+------------+---------------------+---------+---------------+
| cost_name                    | cost_value | last_update         | comment | default_value |
+------------------------------+------------+---------------------+---------+---------------+
| disk_temptable_create_cost   |       NULL | 2022-05-11 16:09:37 | NULL    |            20 |
| disk_temptable_row_cost      |       NULL | 2022-05-11 16:09:37 | NULL    |           0.5 |
| key_compare_cost             |       NULL | 2022-05-11 16:09:37 | NULL    |          0.05 |
| memory_temptable_create_cost |       NULL | 2022-05-11 16:09:37 | NULL    |             1 |
| memory_temptable_row_cost    |       NULL | 2022-05-11 16:09:37 | NULL    |           0.1 |
| row_evaluate_cost            |       NULL | 2022-05-11 16:09:37 | NULL    |           0.1 |
+------------------------------+------------+---------------------+---------+---------------+
mysql> select * from mysql.engine_cost;
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| engine_name | device_type | cost_name              | cost_value | last_update         | comment | default_value |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| default     |           0 | io_block_read_cost     |       NULL | 2022-05-11 16:09:37 | NULL    |             1 |
| default     |           0 | memory_block_read_cost |       NULL | 2023-01-09 11:17:39 | NULL    |          0.25 |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+

计算公式:

如上面介绍的一样,代价模型将操作分为两类io_cost和cpu_cost,io_cost+cpu_cost就是总的cost,下面是具体的计算方法:

全表扫描:

全表扫描成本 = io_cost + 1.1 + cpu_cost + 1 (io_cost +1.1和cpu_cost +1在代码里是直接硬加上的,不知道为什么,计算的时候直接加上)

io_cost = clustered_index_size (统计信息中的主键页数) * avg_single_page_cost(读取一个页的平均成本)

avg_single_page_cost = pages_in_memory_percent * 0.25(memory_block_read_cost) + pages_on_disk_percent * 1.0(io_block_read_cost)

pages_in_memory_percent 表示已经加载到 Buffer Pool 中的叶结点占所有叶结点的比例 pages_on_disk_percent 表示没有加载到 Buffer Pool 中的叶结点占所有叶结点的比例

所以当数据已经全部读取到buffer pool中的时候:

io_cost=clustered_index_size * 0.25

都没有读取到buffer pool中的时候:

io_cost=clustered_index_size * 1.0

当部分数据在buffer pool中,部分数据需要从磁盘读取时,这时的系数介于0.25到1之间

cpu_cost = n_rows(统计信息中记录数) * 0.1(row_evaluate_cost)

走索引的成本:

和全表扫描的计算方法类似,其中io_cost与搜索的区间数有关,比如扫描三个区间where a between 1 and 10 or a between 20 and 30 or a between 40 and 50,此时:

io_cost=3 * avg_single_page_cost

cpu_cost=记录数 * 0.1(row_evaluate_cost)+0.01(代码中的微调参数)

针对二级索引还会有回表的操作:

MySQL认为每次回表都相当于是访问一个页面,所以每次回表都会进行一次IO,这部分成本:

io_cost=rows(记录数)*avg_single_page_cost

对回表查询的数据还需要进行一次计算:

cpu_cost=rows(记录数) * 0.1(row_evaluate_cost)(需要注意的是当索引需要回表扫描时,在rows_estimation阶段并不会计算这个值,在considered_execution_plans阶段会重新加上这部分成本)

所以针对需要回表的查询:

io_cost=查询区间 * avg_single_page_cost + rows(记录数) * avg_single_page_cost

cpu_cost=记录数 * 0.1(row_evaluate_cost) + 0.01(代码中的微调参数) + rows(记录数) * 0.1(row_evaluate_cost)

例子:

mysql> set optimizer_trace='enabled=on';
Query OK, 0 rows affected (0.00 sec)

mysql>explain select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
| id | select_type | table | partitions | type   | possible_keys                        | key           | key_len | ref            | rows | filtered | Extra                 |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
|  1 | SIMPLE      | t2    | NULL       | range  | id_num_unique,idx_age,idx_age_id_num | idx_age       | 1       | NULL           | 9594 |   100.00 | Using index condition |
|  1 | SIMPLE      | t1    | NULL       | eq_ref | id_num_unique,idx_age                | id_num_unique | 60      | test.t2.id_num |    1 |    50.00 | Using where           |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
2 rows in set, 1 warning (0.04 sec)

查看optimizer_trace的内容

select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20 | {
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `t1`.`id` AS `id`,`t1`.`id_num` AS `id_num`,`t1`.`lastname` AS `lastname`,`t1`.`firstname` AS `firstname`,`t1`.`mobile` AS `mobile`,`t1`.`sex` AS `sex`,`t1`.`birthday` AS `birthday`,`t1`.`age` AS `age`,`t1`.`top_education` AS `top_education`,`t1`.`address` AS `address`,`t1`.`income_by_year` AS `income_by_year`,`t1`.`create_time` AS `create_time`,`t1`.`update_time` AS `update_time`,`t2`.`id` AS `id`,`t2`.`id_num` AS `id_num`,`t2`.`lastname` AS `lastname`,`t2`.`firstname` AS `firstname`,`t2`.`mobile` AS `mobile`,`t2`.`sex` AS `sex`,`t2`.`birthday` AS `birthday`,`t2`.`age` AS `age`,`t2`.`top_education` AS `top_education`,`t2`.`address` AS `address`,`t2`.`income_by_year` AS `income_by_year`,`t2`.`create_time` AS `create_time`,`t2`.`update_time` AS `update_time` from (`basic_person_info` `t1` join `basic_person_info2` `t2` on((`t1`.`id_num` = `t2`.`id_num`))) where ((`t1`.`age` > 10) and (`t2`.`age` < 20))"
          },
          {
            "transformations_to_nested_joins": {
              "transformations": [
                "JOIN_condition_to_WHERE",
                "parenthesis_removal"
              ],
              "expanded_query": "/* select#1 */ select `t1`.`id` AS `id`,`t1`.`id_num` AS `id_num`,`t1`.`lastname` AS `lastname`,`t1`.`firstname` AS `firstname`,`t1`.`mobile` AS `mobile`,`t1`.`sex` AS `sex`,`t1`.`birthday` AS `birthday`,`t1`.`age` AS `age`,`t1`.`top_education` AS `top_education`,`t1`.`address` AS `address`,`t1`.`income_by_year` AS `income_by_year`,`t1`.`create_time` AS `create_time`,`t1`.`update_time` AS `update_time`,`t2`.`id` AS `id`,`t2`.`id_num` AS `id_num`,`t2`.`lastname` AS `lastname`,`t2`.`firstname` AS `firstname`,`t2`.`mobile` AS `mobile`,`t2`.`sex` AS `sex`,`t2`.`birthday` AS `birthday`,`t2`.`age` AS `age`,`t2`.`top_education` AS `top_education`,`t2`.`address` AS `address`,`t2`.`income_by_year` AS `income_by_year`,`t2`.`create_time` AS `create_time`,`t2`.`update_time` AS `update_time` from `basic_person_info` `t1` join `basic_person_info2` `t2` where ((`t1`.`age` > 10) and (`t2`.`age` < 20) and (`t1`.`id_num` = `t2`.`id_num`))"
            }
          }
        ]
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and (`t1`.`id_num` = `t2`.`id_num`))",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and multiple equal(`t1`.`id_num`, `t2`.`id_num`))"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and multiple equal(`t1`.`id_num`, `t2`.`id_num`))"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and multiple equal(`t1`.`id_num`, `t2`.`id_num`))"
                }
              ]
            }
          },
          {
            "substitute_generated_columns": {
            }
          },
          {
            "table_dependencies": [
              {
                "table": "`basic_person_info` `t1`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ]
              },
              {
                "table": "`basic_person_info2` `t2`",
                "row_may_be_null": false,
                "map_bit": 1,
                "depends_on_map_bits": [
                ]
              }
            ]
          },
          {
            "ref_optimizer_key_uses": [
              {
                "table": "`basic_person_info` `t1`",
                "field": "id_num",
                "equals": "`t2`.`id_num`",
                "null_rejecting": true
              },
              {
                "table": "`basic_person_info2` `t2`",
                "field": "id_num",
                "equals": "`t1`.`id_num`",
                "null_rejecting": true
              }
            ]
          },
          {
            "rows_estimation": [
              {
                "table": "`basic_person_info` `t1`",
                "range_analysis": {
                  "table_scan": {
                    "rows": 86734,
                    "cost": 8859.75
                    
                    t1表的scan成本=聚簇索引页数*0.25 + 行数 * 0.1 +1.1+1
                    737*0.25+1.1+86734*0.1+1=8859.75
                    
                  },
                  "potential_range_indexes": [
                    {
                      "index": "PRIMARY",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "id_num_unique",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "mobile_unique",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_basic_person_info_name",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_basic_person_info_top_education",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_basic_person_info_create_time",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_basic_person_info_mobile",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_age",
                      "usable": true,
                      "key_parts": [
                        "age",
                        "id"
                      ]
                    }
                  ],
                  "setup_range_conditions": [
                  ],
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_single_table"
                  },
                  "skip_scan_range": {
                    "chosen": false,
                    "cause": "not_single_table"
                  },
                  "analyzing_range_alternatives": {
                    "range_scan_alternatives": [
                      {
                        "index": "idx_age",
                        "ranges": [
                          "10 < age"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 43367,
                        "cost": 15178.7,
                        
                        通过索引idx_age读取数据:
                        io_cost=区间数* 0.25 +记录数* 0.25
                        io_cost=1*0.25+43367*0.25=10,842  
                        cpu_cost=记录数* 0.1 (没有回表的cost)
                        cpu_cost=43367*0.1=4,336.7 
                        cost=10842+4,336.7=15178.7
                        
                        "chosen": false,
                        "cause": "cost"
                      }
                    ],
                    "analyzing_roworder_intersect": {
                      "usable": false,
                      "cause": "too_few_roworder_scans"
                    }
                  }
                }
              },
              {
                "table": "`basic_person_info2` `t2`",
                "range_analysis": {
                  "table_scan": {
                    "rows": 73845,
                    "cost": 7538.85
                    
                    t2表的scan成本=聚簇索引页数*0.25 + 行数 * 0.1 +1.1+1
                    609*0.25+1+73845*0.1+1.1=7538.85
                    
                  },
                  "potential_range_indexes": [
                    {
                      "index": "PRIMARY",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "id_num_unique",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "mobile_unique",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_basic_person_info_name",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_basic_person_info_top_education",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_basic_person_info_create_time",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_basic_person_info_mobile",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "idx_age",
                      "usable": true,
                      "key_parts": [
                        "age",
                        "id"
                      ]
                    },
                    {
                      "index": "idx_age_id_num",
                      "usable": true,
                      "key_parts": [
                        "age",
                        "id_num",
                        "id"
                      ]
                    }
                  ],
                  "setup_range_conditions": [
                  ],
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_single_table"
                  },
                  "skip_scan_range": {
                    "chosen": false,
                    "cause": "not_single_table"
                  },
                  "analyzing_range_alternatives": {
                    "range_scan_alternatives": [
                      {
                        "index": "idx_age",
                        "ranges": [
                          "age < 20"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 9594,
                        "cost": 3358.16,
                        
                        通过索引idx_age读取数据:
                        io_cost=区间数* 0.25 +记录数* 0.25
                        io_cost=1*0.25+9594*0.25=2,398.75        
                        cpu_cost=记录数* 0.1   (没有回表的cost) 
                        cpu_cost=9594*0.1959.4  
                        cost=2,398.75+959.4=3,358.15
                        
                        "chosen": true
                      },
                      {
                        "index": "idx_age_id_num",
                        "ranges": [
                          "age < 20"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 19086,
                        "cost": 6680.36,
                        
                        通过索引idx_age_id_num读取数据:
                        io_cost=区间数* 0.25 +记录数* 0.25
                        io_cost=1*0.25+19086*0.25=4,771.75           
                        cpu_cost=记录数* 0.1  (没有回表的cost)  
                        cpu_cost=19086*0.1=1908.6
                        cost=4,771.75+1908.6=6,680.35
                        
                        "chosen": false,
                        "cause": "cost"
                      }
                    ],
                    "analyzing_roworder_intersect": {
                      "usable": false,
                      "cause": "too_few_roworder_scans"
                    }
                  },
                  "chosen_range_access_summary": {
                    "range_access_plan": {
                      "type": "range_scan",
                      "index": "idx_age",
                      "rows": 9594,
                      "ranges": [
                        "age < 20"
                      ]
                    },
                    "rows_for_plan": 9594,
                    "cost_for_plan": 3358.16,
                    "chosen": true
                  }
                }
              }
            ]
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`basic_person_info2` `t2`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "ref",
                      "index": "id_num_unique",
                      "usable": false,
                      "chosen": false
                    },
                    {
                      "rows_to_scan": 9594,
                      "filtering_effect": [
                      ],
                      "final_filtering_effect": 1,
                      "access_type": "range",
                      "range_details": {
                        "used_index": "idx_age"
                      },
                      "resulting_rows": 9594,
                      "cost": 4317.56,
                      
                        通过索引idx_age读取数据:
                        io_cost=区间数* 0.25 +记录数* 0.25
                        io_cost=1*0.25+9594*0.25=2,398.75        
                        cpu_cost=记录数* 0.1 + 记录数* 0.1   
                        cpu_cost=9594*0.1*2=1,918.8  
                        cost=2,398.75+1,918.8=4317.56
                      
                      "chosen": true
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 9594,
                "cost_for_plan": 4317.56,
                "rest_of_plan": [
                  {
                    "plan_prefix": [
                      "`basic_person_info2` `t2`"
                    ],
                    "table": "`basic_person_info` `t1`",
                    "best_access_path": {
                      "considered_access_paths": [
                        {
                          "access_type": "eq_ref",
                          "index": "id_num_unique",
                          "rows": 1,
                          "cost": 3357.9,
                          
                          io_cost=t2表记录数*0.25=9594*0.25=2398.5
                          cpu_cost=记录数*0.1=9594*0.1=959.4
                          cost=2398.5+959.4=3357.9
                          
                          "chosen": true
                        },
                        {
                          "rows_to_scan": 86734,
                          "filtering_effect": [
                          ],
                          "final_filtering_effect": 0.5,
                          "access_type": "scan",
                          "using_join_cache": true,
                          "buffers_needed": 14,
                          "resulting_rows": 43367,
                          "cost": 4.16701e+07,
                          "chosen": false
                        }
                      ]
                    },
                    "condition_filtering_pct": 100,
                    "rows_for_plan": 9594,
                    "cost_for_plan": 7675.46,
                    
                   总cost=4,317.56+3,357.9=7,675.46
                   
                    "chosen": true
                  }
                ]
              },
              {
                "plan_prefix": [
                ],
                "table": "`basic_person_info` `t1`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "ref",
                      "index": "id_num_unique",
                      "usable": false,
                      "chosen": false
                    },
                    {
                      "rows_to_scan": 86734,
                      "filtering_effect": [
                      ],
                      "final_filtering_effect": 0.5,
                      "access_type": "scan",
                      "resulting_rows": 43367,
                      "cost": 8857.65,
                      
                      t1的scan成本
                      
                      "chosen": true
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 43367,
                "cost_for_plan": 8857.65,
                "pruned_by_cost": true
                
                放弃后续的计算
                
              }
            ]
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "((`t1`.`id_num` = `t2`.`id_num`) and (`t1`.`age` > 10) and (`t2`.`age` < 20))",
              "attached_conditions_computation": [
              ],
              "attached_conditions_summary": [
                {
                  "table": "`basic_person_info2` `t2`",
                  "attached": "(`t2`.`age` < 20)"
                },
                {
                  "table": "`basic_person_info` `t1`",
                  "attached": "((`t1`.`id_num` = `t2`.`id_num`) and (`t1`.`age` > 10))"
                }
              ]
            }
          },
          {
            "finalizing_table_conditions": [
              {
                "table": "`basic_person_info2` `t2`",
                "original_table_condition": "(`t2`.`age` < 20)",
                "final_table_condition   ": "(`t2`.`age` < 20)"
              },
              {
                "table": "`basic_person_info` `t1`",
                "original_table_condition": "((`t1`.`id_num` = `t2`.`id_num`) and (`t1`.`age` > 10))",
                "final_table_condition   ": "(`t1`.`age` > 10)"
              }
            ]
          },
          {
            "refine_plan": [
              {
                "table": "`basic_person_info2` `t2`",
                "pushed_index_condition": "(`t2`.`age` < 20)",
                "table_condition_attached": null
              },
              {
                "table": "`basic_person_info` `t1`"
              }
            ]
          }
        ]
      }
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
        ]
      }
    }
  ]
}

成本常数修改:

前面已经介绍了成本常量值实际上存放在MySQL自带的系统库MySQL中的server_cost和engine_cost表中,其中server_cost表存放server层的成本常量,engine_cost表存放engine层成本常量

mysql> select * from mysql.server_cost;
+------------------------------+------------+---------------------+---------+---------------+
| cost_name                    | cost_value | last_update         | comment | default_value |
+------------------------------+------------+---------------------+---------+---------------+
| disk_temptable_create_cost   |       NULL | 2022-05-11 16:09:37 | NULL    |            20 |
| disk_temptable_row_cost      |       NULL | 2022-05-11 16:09:37 | NULL    |           0.5 |
| key_compare_cost             |       NULL | 2022-05-11 16:09:37 | NULL    |          0.05 |
| memory_temptable_create_cost |       NULL | 2022-05-11 16:09:37 | NULL    |             1 |
| memory_temptable_row_cost    |       NULL | 2022-05-11 16:09:37 | NULL    |           0.1 |
| row_evaluate_cost            |       NULL | 2022-05-11 16:09:37 | NULL    |           0.1 |
+------------------------------+------------+---------------------+---------+---------------+

mysql> select * from mysql.engine_cost;
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| engine_name | device_type | cost_name              | cost_value | last_update         | comment | default_value |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| default     |           0 | io_block_read_cost     |       NULL | 2022-05-11 16:09:37 | NULL    |             1 |
| default     |           0 | memory_block_read_cost |       NULL | 2023-01-09 11:17:39 | NULL    |          0.25 |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+

其中 default_value的值是系统默认的,不能修改,cost_value列的值我们可以修改,如果cost_value列的值不为空系统将用该值覆盖默认值,我们可以通过update语句来修改

mysql> update mysql.engine_cost set cost_value=10 where cost_name='memory_block_read_cost';
Query OK, 0 rows affected (0.00 sec)
mysql> update mysql.engine_cost set cost_value=10 where cost_name='io_block_read_cost';
Query OK, 0 rows affected (0.00 sec)

很多资料都说执行flush optimizer_costs就可以生效,不过我在修改完后并执行flush optimizer_costs并不能马上生效,最后是通过重启数据库实例才生效,这个可能是数据库版本的差异,大家可以自行验证。

mysql> explain select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+-------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys                        | key           | key_len | ref            | rows  | filtered | Extra       |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+-------+----------+-------------+
|  1 | SIMPLE      | t2    | NULL       | ALL    | id_num_unique,idx_age,idx_age_id_num | NULL          | NULL    | NULL           | 73990 |    12.97 | Using where |
|  1 | SIMPLE      | t1    | NULL       | eq_ref | id_num_unique,idx_age                | id_num_unique | 60      | test.t2.id_num |     1 |    50.00 | Using where |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+-------+----------+-------------+

"table": "`basic_person_info2` `t2`",
                "range_analysis": {
                  "table_scan": {
                    "rows": 73990,
                    "cost": 13491.1
                   
                   全表扫描cost=609*10+73990*0.1+1.1+1= 13491.1
                   
                  },
"index": "idx_age",
                        "ranges": [
                          "age < 20"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 9594,
                        "cost": 96909.4,
                        
                        idx_age索引扫描cost=1*10+9594*10+9594*0.1=96,909.4
                        
                        "chosen": false,
                        "cause": "cost"
                      },

修改后的执行计划,发现t2表走了全表扫描了而没有走idx_age索引,分别查看一下t2表走全表扫描和idx_age索引的cost发现全表扫描的cost为13491.1,而走索引的cost为96,909.4,因为全表扫描的cost比走索引低,所以优化器没有选择idx_age索引。

从这个例子可以看出,更改成本常量值会直接影响优化器的方案选择,所以一定要慎重,没有特殊原因建议不要修改。

explain format=json

虽然通过optimizer_trace可以看到很多详细的优化器选择过程,但是使用起来起来还是比较麻烦,需要过滤的信息很多,这时explain format=json输出json格式的分析数据也是一个不错的选择,它也包含语句将要执行的成本信息,如下:

query_cost  总查询成本
read_cost   IO成本+除 eval_cost以外cpu成本
eval_cost   检测rows * filter条记录的成本
prefix_cost 单次查询的成本,等于read_cost+eval_cost
mysql> explain format=json select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "7675.46"
    },
    "nested_loop": [
      {
        "table": {
          "table_name": "t2",
          "access_type": "range",
          "possible_keys": [
            "id_num_unique",
            "idx_age",
            "idx_age_id_num"
          ],
          "key": "idx_age",
          "used_key_parts": [
            "age"
          ],
          "key_length": "1",
          "rows_examined_per_scan": 9594,
          "rows_produced_per_join": 9594,
          "filtered": "100.00",
          "index_condition": "(`test`.`t2`.`age` < 20)",
          "cost_info": {
            "read_cost": "3358.16",
            包含所有io成本+(cpu成本-eval_cost)
            "eval_cost": "959.40",
            计算扇出的cpu成本,优化器利用启发式规则估算出满足所有条件的的比例(filtered)
            =rows_examined_per_scan*filtered*0.1
            "prefix_cost": "4317.56",
            单表查询的总成本
            
            "data_read_per_join": "3M"
          },
          "used_columns": [
            "id",
            "id_num",
            "lastname",
            "firstname",
            "mobile",
            "sex",
            "birthday",
            "age",
            "top_education",
            "address",
            "income_by_year",
            "create_time",
            "update_time"
          ]
        }
      },
      {
        "table": {
          "table_name": "t1",
          "access_type": "eq_ref",
          "possible_keys": [
            "id_num_unique",
            "idx_age"
          ],
          "key": "id_num_unique",
          "used_key_parts": [
            "id_num"
          ],
          "key_length": "60",
          "ref": [
            "test.t2.id_num"
          ],
          "rows_examined_per_scan": 1,
          "rows_produced_per_join": 4797,
          "filtered": "50.00",
          "cost_info": {
            "read_cost": "2398.50",
            包含所有io成本+(cpu成本-eval_cost)
            "eval_cost": "479.70",
            计算扇出的cpu成本,优化器利用启发式规则估算出满足所有条件的的比例(filtered)
            =rows_examined_per_scan*filtered*0.1
            "prefix_cost": "7675.46",
            两表查询的总cost
            "data_read_per_join": "1M"
          },
          "used_columns": [
            "id",
            "id_num",
            "lastname",
            "firstname",
            "mobile",
            "sex",
            "birthday",
            "age",
            "top_education",
            "address",
            "income_by_year",
            "create_time",
            "update_time"
          ],
          "attached_condition": "(`test`.`t1`.`age` > 10)"
        }
      }
    ]
  }
}

另外,explain结合show warnings语句一起使用还可以得知优化器改写后的语句

mysql> show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,`test`.`t1`.`id_num` AS `id_num`,`test`.`t1`.`lastname` AS `lastname`,`test`.`t1`.`firstname` AS `firstname`,`test`.`t1`.`mobile` AS `mobile`,`test`.`t1`.`sex` AS `sex`,`test`.`t1`.`birthday` AS `birthday`,`test`.`t1`.`age` AS `age`,`test`.`t1`.`top_education` AS `top_education`,`test`.`t1`.`address` AS `address`,`test`.`t1`.`income_by_year` AS `income_by_year`,`test`.`t1`.`create_time` AS `create_time`,`test`.`t1`.`update_time` AS `update_time`,`test`.`t2`.`id` AS `id`,`test`.`t2`.`id_num` AS `id_num`,`test`.`t2`.`lastname` AS `lastname`,`test`.`t2`.`firstname` AS `firstname`,`test`.`t2`.`mobile` AS `mobile`,`test`.`t2`.`sex` AS `sex`,`test`.`t2`.`birthday` AS `birthday`,`test`.`t2`.`age` AS `age`,`test`.`t2`.`top_education` AS `top_education`,`test`.`t2`.`address` AS `address`,`test`.`t2`.`income_by_year` AS `income_by_year`,`test`.`t2`.`create_time` AS `create_time`,`test`.`t2`.`update_time` AS `update_time` from `test`.`basic_person_info` `t1` join `test`.`basic_person_info2` `t2` where ((`test`.`t1`.`id_num` = `test`.`t2`.`id_num`) and (`test`.`t1`.`age` > 10) and (`test`.`t2`.`age` < 20)) |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

总结:

  • MySQL的优化器是基于成本来选择最优执行方案的,哪个成本最少就选哪个,所以重点在于计算出各个执行计划的cost
  • 成本由CPU成本和IO成本组成,每个成本常数值可以自己调整,非必要的情况下不要调整,以免影响整个数据库的执行计划选择
  • 通过开启optimizer_trace可以跟踪优化器的各个环节的分析步骤,可以判断有时候为什么没有走索引而走了全表扫描
  • explain加上format=json选项后可以查看成本信息分为read_cost和eval_cost,但只能看到当前已经选择的执行计划,另外通过show warnings可以看到优化器改写后的语句

Enjoy GreatSQL ?

\


Enjoy GreatSQL ?

关于 GreatSQL

GreatSQL是由万里数据库维护的MySQL分支,专注于提升MGR可靠性及性能,支持InnoDB并行查询特性,是适用于金融级应用的MySQL分支版本。

相关链接: GreatSQL社区 Gitee GitHub Bilibili

GreatSQL社区:

社区博客有奖征稿详情:https://greatsql.cn/thread-100-1-1.html

技术交流群:

微信:扫码添加GreatSQL社区助手微信好友,发送验证信息加群

有关cost量化分析的更多相关文章

  1. ruby-on-rails - Ruby on Rails : . 常量化 : wrong constant name error? - 2

    我正在使用这个:4.times{|i|assert_not_equal("content#{i+2}".constantize,object.first_content)}我之前声明过局部变量content1content2content3content4content5我得到的错误NameError:wrongconstantnamecontent2这个错误是什么意思?我很确定我想要content2=\ 最佳答案 你必须用一个大字母来调用ruby​​常量:Content2而不是content2。Aconstantnamestart

  2. 建模分析 | 平面2R机器人(二连杆)运动学与动力学建模(附Matlab仿真) - 2

    目录0专栏介绍1平面2R机器人概述2运动学建模2.1正运动学模型2.2逆运动学模型2.3机器人运动学仿真3动力学建模3.1计算动能3.2势能计算与动力学方程3.3动力学仿真0专栏介绍?附C++/Python/Matlab全套代码?课程设计、毕业设计、创新竞赛必备!详细介绍全局规划(图搜索、采样法、智能算法等);局部规划(DWA、APF等);曲线优化(贝塞尔曲线、B样条曲线等)。?详情:图解自动驾驶中的运动规划(MotionPlanning),附几十种规划算法1平面2R机器人概述如图1所示为本文的研究本体——平面2R机器人。对参数进行如下定义:机器人广义坐标

  3. 网站日志分析软件--让网站日志分析工作变得更简单 - 2

    网站的日志分析,是seo优化不可忽视的一门功课,但网站越大,每天产生的日志就越大,大站一天都可以产生几个G的网站日志,如果光靠肉眼去分析,那可能看到猴年马月都看不完,因此借助网站日志分析工具去分析网站日志,那将会使网站日志分析工作变得更简单。下面推荐两款网站日志分析软件。第一款:逆火网站日志分析器逆火网站日志分析器是一款功能全面的网站服务器日志分析软件。通过分析网站的日志文件,不仅能够精准的知道网站的访问量、网站的访问来源,网站的广告点击,访客的地区统计,搜索引擎关键字查询等,还能够一次性分析多个网站的日志文件,让你轻松管理网站。逆火网站日志分析器下载地址:https://pan.baidu.

  4. ABB-IRB-1200运动学分析MATLAB RVC工具分析+Simulink-Adams联合仿真 - 2

    一、机器人介绍        此处是基于MATLABRVC工具箱,对ABB-IRB-1200型号的微型机械臂进行正逆向运动学分析,并利Simulink工具实现对机械臂进行具有动力学参数的末端轨迹规划仿真,最后根据机械模型设计Simulink-Adams联合仿真。 图1.ABBIRB 1200尺寸参数示意图ABBIRB 1200提供的两种型号广泛适用于各作业,且两者间零部件通用,两种型号的工作范围分别为700 mm 和 900 mm,大有效负载分别为 7 kg 和5 kg。 IRB 1200 能够在狭小空间内能发挥其工作范围与性能优势,具有全新的设计、小型化的体积、高效的性能、易于集成、便捷的接

  5. 关于Qt程序打包后运行库依赖的常见问题分析及解决方法 - 2

    目录一.大致如下常见问题:(1)找不到程序所依赖的Qt库version`Qt_5'notfound(requiredby(2)CouldnotLoadtheQtplatformplugin"xcb"in""eventhoughitwasfound(3)打包到在不同的linux系统下,或者打包到高版本的相同系统下,运行程序时,直接提示段错误即segmentationfault,或者Illegalinstruction(coredumped)非法指令(4)ldd应用程序或者库,查看运行所依赖的库时,直接报段错误二.问题逐个分析,得出解决方法:(1)找不到程序所依赖的Qt库version`Qt_5'

  6. ruby-on-rails - 如何使用 ruby​​-prof 和 JMeter 分析 Rails - 2

    我想使用ruby​​-prof和JMeter分析Rails应用程序。我对分析特定Controller/操作/或模型方法的建议方法不感兴趣,我想分析完整堆栈,从上到下。所以我运行这样的东西:RAILS_ENV=productionruby-prof-fprof.outscript/server>/dev/null然后我在上面运行我的JMeter测试计划。然而,问题是使用CTRL+C或SIGKILL中断它也会在ruby​​-prof可以写入任何输出之前杀死它。如何在不中断ruby​​-prof的情况下停止mongrel服务器? 最佳答案

  7. 【Unity游戏破解】外挂原理分析 - 2

    文章目录认识unity打包目录结构游戏逆向流程Unity游戏攻击面可被攻击原因mono的打包建议方案锁血飞天无限金币攻击力翻倍以上统称内存挂透视自瞄压枪瞬移内购破解Unity游戏防御开发时注意数据安全接入第三方反作弊系统外挂检测思路狠人自爆实战查看目录结构用il2cppdumper例子2-森林whoishe后记认识unity打包目录结构dll一般很大,因为里面是所有的游戏功能编译成的二进制码游戏逆向流程开发人员代码被编译打包到GameAssembly.dll中使用il2ppDumper工具,并借助游戏名_Data\il2cpp_data\Metadata\global-metadata.dat

  8. 驱动开发:内核无痕隐藏自身分析 - 2

    在笔者前面有一篇文章《驱动开发:断链隐藏驱动程序自身》通过摘除驱动的链表实现了断链隐藏自身的目的,但此方法恢复时会触发PG会蓝屏,偶然间在网上找到了一个作者介绍的一种方法,觉得有必要详细分析一下他是如何实现的进程隐藏的,总体来说作者的思路是最终寻找到MiProcessLoaderEntry的入口地址,该函数的作用是将驱动信息加入链表和移除链表,运用这个函数即可动态处理驱动的添加和移除问题。MiProcessLoaderEntry(pDriverObject->DriverSection,1)添加MiProcessLoaderEntry(pDriverObject->DriverSection,

  9. ruby-on-rails - 常量化通过浏览器提交的参数是不好的做法吗? - 2

    我有一个单表继承设置,我有一个Controller(我觉得有多个Controller会重复)。但是,对于某些方法,我想调用模型的子类。我想我可以让浏览器发送一个参数,我会针对该参数编写一个case语句。像这样的东西:case@model[:type]when"A"@results=Subclass1.search(params[:term])when"B"@results=Subclass2.search(params[:term])...end或者,我了解到Ruby的所有技巧都可以用字符串创建模型。像这样的东西:@results=params[:model].constantize.

  10. ruby-on-rails - 使用模型属性调用的 Brakeman 不安全反射方法常量化 - 2

    在我的Rails应用程序中,我收到来自brakeman的以下安全警告。使用模型属性调用的不安全反射方法常量化。这是我的代码正在执行的操作。chart_type=Chart.where(id:chart_id,).pluck(:type).firstbeginChartPresenter.new(chart_type.camelize.constantize.find(chart_id))rescueraise"Unabletofindthechartpresenter"end根据我的研究,我还没有找到任何具体的解决方案。我听说你可以创建一个白名单,但我不确定brakeman在寻找什么。

随机推荐