MySQL数据库信息统计表
前言
聊聊MySQL数据库中的统计信息,众所周知,MySQL在执行sql时,会使用统计信息进行判断,采用最优(cost花费最低)的执行计划,而这些统计信息是怎么进行的,在用户角度如何去调整或者理解统计信息呢?
本次演示使用的MySQL版本:8.0.12
innodb_table_stats 和 innodb_index_stats
innodb_table_stats和innodb_index_stats都是在MySQL实例中的mysql数据库下的表,分别存储表和索引级别的统计信息。
innodb_table_stats的字段说明:
Column name | Description | 说明 |
---|---|---|
database_name | Database name | 数据库名 |
table_name | Table name, partition name, or subpartition name | 表名 |
last_update | A timestamp indicating the last time that InnoDB updated this row | 最后更新时间 |
n_rows | The number of rows in the table | 表的行数(估值) |
clustered_index_size | The size of the primary index, in pages | 聚集索引页数 |
sum_of_other_index_sizes | The total size of other (non-primary) indexes, in pages | 二级索引页数 |
参考MySQL官方文档
innodb_table_stats示例:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
mysql> select * from mysql.innodb_table_stats where database_name='wstest';
+---------------+-------------+---------------------+----------+----------------------+--------------------------+
| database_name | table_name | last_update | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+-------------+---------------------+----------+----------------------+--------------------------+
| wstest | mul_replace | 2018-11-13 22:01:58 | 2 | 1 | 1 |
| wstest | rdslist | 2018-11-16 10:30:52 | 324 | 5 | 0 |
| wstest | t1 | 2019-03-12 14:13:33 | 8 | 1 | 1 |
| wstest | t2 | 2018-11-27 11:39:08 | 10 | 1 | 1 |
| wstest | t3 | 2018-12-12 14:52:23 | 302472 | 865 | 0 |
| wstest | t4 | 2018-12-12 15:07:34 | 4155228 | 21043 | 0 |
| wstest | t5 | 2019-03-05 19:15:28 | 11301308 | 58688 | 0 |
| wstest | test | 2018-10-30 16:29:22 | 6 | 1 | 0 |
| wstest | testgroupby | 2019-01-02 16:36:04 | 6 | 1 | 0 |
| wstest | tmstamp | 2019-01-07 17:02:43 | 2 | 1 | 0 |
+---------------+-------------+---------------------+----------+----------------------+--------------------------+
10 rows in set (0.00 sec)
从上述内容可以看出表的基本状态,包括表名,聚集索引和非聚集索引(二级索引)大小等等,此外,还能看出部分表在一段时间内没有更新,或者更新的行数没有超过recalculate的阈值(10% rows)
而想要获取到更为详细的信息,需要查看innodb_index_stats:
Column name | Description | 说明 |
---|---|---|
database_name | Database name | 数据库名 |
table_name | Table name, partition name, or subpartition name | 表名 |
index_name | Index name | 索引名 |
last_update | A timestamp indicating the last time that InnoDB updated this row | 最后更新时间 |
stat_name | The name of the statistic, whose value is reported in the stat_value column | 统计线程的名称 |
stat_value | The value of the statistic that is named in stat_name column | 统计线程的值 |
sample_size | The number of pages sampled for the estimate provided in the stat_value column | 采样页数 |
stat_description | Description of the statistic that is named in the stat_name column | 统计的说明 |
innodb_index_stats示例:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
<br></br>mysql> select * from mysql.innodb_index_stats where database_name='wstest' and table_name='t4';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| wstest | t4 | PRIMARY | 2019-03-21 18:57:41 | n_diff_pfx01 | 4155228 | 20 | id |
| wstest | t4 | PRIMARY | 2019-03-21 18:57:41 | n_leaf_pages | 20988 | NULL | Number of leaf pages in the index |
| wstest | t4 | PRIMARY | 2019-03-21 18:57:41 | size | 21043 | NULL | Number of pages in the index |
| wstest | t4 | idx_name | 2019-03-21 18:57:41 | n_diff_pfx01 | 1 | 2 | name |
| wstest | t4 | idx_name | 2019-03-21 18:57:41 | n_diff_pfx02 | 4189010 | 20 | name,id |
| wstest | t4 | idx_name | 2019-03-21 18:57:41 | n_leaf_pages | 5810 | NULL | Number of leaf pages in the index |
| wstest | t4 | idx_name | 2019-03-21 18:57:41 | size | 6699 | NULL | Number of pages in the index |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
7 rows in set (0.00 sec)
首先,对于stat_name这列的解释:
– size:该索引总的页数
– n_leaf_pages:索引中叶子节点的页数
– n_diff_pfx01:前01个字段在索引中的唯一值数目(估值)
– n_diff_pfx02:前02个字段在索引中的唯一值数目(估值)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
mysql> select count(distinct name) from wstest.t4; ----n_diff_pfx01对应的value
+----------------------+
| count(distinct name) |
+----------------------+
| 2 |
+----------------------+
1 row in set (0.00 sec)
mysql> select count(distinct name,id) from wstest.t4; ----n_diff_pfx02对应的value
+-------------------------+
| count(distinct name,id) |
+-------------------------+
| 4194304 |
+-------------------------+
1 row in set (2.99 sec)
其他信息:
使用information_schema.tables查看统计信息:
1
2
3
4
5
6
7
8
mysql> select table_name,table_rows,data_length,index_length from information_schema.tables where table_name='t4';
+------------+------------+-------------+--------------+
| TABLE_NAME | TABLE_ROWS | DATA_LENGTH | INDEX_LENGTH |
+------------+------------+-------------+--------------+
| t4 | 4155228 | 344768512 | 0 |
+------------+------------+-------------+--------------+
1 row in set (0.07 sec)
发现INDEX_LENGTH 列为0,这是由于idx_name这列,是我为了展示二级索引时加上的,还没有能刷新在information_schema.tables里。此时,使用analyze table t4;进行统计信息的重新收集
(注:若是必须analyze,一定要放在低峰进行操作”)
1
2
3
4
5
6
7
8
mysql> analyze table wstest.t4;
+-----------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+-----------+---------+----------+----------+
| wstest.t4 | analyze | status | OK |
+-----------+---------+----------+----------+
1 row in set (0.11 sec)
重新收集统计信息后,查看information_schema.tables中的信息,与mysql.innodb_index_stats中的结果进行对比:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
mysql> SELECT SUM(stat_value) pages, index_name, SUM(stat_value)*@@innodb_page_size size FROM mysql.innodb_index_stats WHERE table_name='t4' AND stat_name = 'size' GROUP BY index_name;
+-------+------------+-----------+
| pages | index_name | size |
+-------+------------+-----------+
| 21108 | PRIMARY | 345833472 |
| 6699 | idx_name | 109756416 |
+-------+------------+-----------+
2 rows in set (0.00 sec)
mysql> select table_name,table_rows,data_length,index_length from information_schema.tables where table_name='t4';
+------------+------------+-------------+--------------+
| TABLE_NAME | TABLE_ROWS | DATA_LENGTH | INDEX_LENGTH |
+------------+------------+-------------+--------------+
| t4 | 4173444 | 345833472 | 109756416 |
+------------+------------+-------------+--------------+
1 row in set (0.00 sec)
可以看出:
1. information_schema.tables中的DATA_LENGTH 值 = PRIMARY页数 * 页的大小,即对于InnoDB,聚集索引的大小即为数据大小。
- information_schema.tables中的INDEX_LENGTH值 = 二级索引页数 * 页的大小
再在t4上添加一个索引:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
mysql> analyze table wstest.t4; +-----------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+-----------+---------+----------+----------+
| wstest.t4 | analyze | status | OK |
+-----------+---------+----------+----------+
1 row in set (0.06 sec)
mysql> select table_name,table_rows,data_length,index_length from information_schema.tables where table_name='t4';
+------------+------------+-------------+--------------+
| TABLE_NAME | TABLE_ROWS | DATA_LENGTH | INDEX_LENGTH |
+------------+------------+-------------+--------------+
| t4 | 4173444 | 345833472 | 356155392 |
+------------+------------+-------------+--------------+
1 row in set (0.00 sec)
mysql> SELECT SUM(stat_value) pages, index_name, SUM(stat_value)*@@innodb_page_size size FROM mysql.innodb_index_stats WHERE table_name='t4' AND stat_name = 'size' GROUP BY index_name;
+-------+------------+-----------+
| pages | index_name | size |
+-------+------------+-----------+
| 21108 | PRIMARY | 345833472 |
| 15039 | idx_addr | 246398976 |
| 6699 | idx_name | 109756416 |
+-------+------------+-----------+
3 rows in set (0.01 sec)
可以看出:INDEX_LENGTH = size( idx_addr + idx_name )
相关配置信息
为了使统计信息更加准确,需要关注:
innodb_stats_transient_sample_pages(innodb_stats_persistent=OFF时):在收集统计信息时的采样索引pages页数,默认为8。
innodb_stats_persistent_sample_pages(innodb_stats_persistent=ON时):在收集统计信息时的采样索引pages页数,默认为20。
innodb_stats_persistent:是否将InnoDB的索引统计信息同步到磁盘,若设置为0,则在产生变更时,会自动重新收集信息,得到不同的执行计划,默认为ON。
innodb_stats_auto_recalc(innodb_stats_persistent=ON时):在表的数据被更改后(阈值:10%),自动地重新估算统计信息