MySQL表索引基数 [英] MySQL Table Index Cardinality

查看:100
本文介绍了MySQL表索引基数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个很大的MySQL数据库表,有250万行并且还在增长。为了加快查询速度,我在其中一列中添加了索引。例如,当我通过PHPMyAdmin手动设置索引时,基数约为1500,这似乎是正确的,并且我的查询运行没有问题。



然后是在运行了几个查询(尤其是在INSERT上,但不仅限于)后,该问题的基数下降到17或18,而且查询运行非常慢。有时它似乎可以恢复到大约1500,或者我必须再次通过PHPMyAdmin进行操作。



有什么办法可以阻止这种基数下降的发生? p>

 如果不存在`probe_results`(
`probe_result_id` int(11)NOT NULL AUTO_INCREMENT,
`date`日期非空,
`month`int(11)非空,
`year`int(11)非空,
`time`时间非空,
`type` varchar(11)NOT NULL,
`probe_id` varchar(50)NOT NULL,
`status` varchar(11)NOT NULL,
`temp_1`十进制(11,0 )NOT NULL,
`temp_2`十进制(11,0)NOT NULL,
`crc` varchar(11)NOT NULL,
`raw_data`文本NOT NULL,
` txt_file`文本NOT NULL,
PRIMARY KEY(`probe_result_id`),
KEY`probe_id`(`probe_id`)
)ENGINE = InnoDB DEFAULT CHARSET = latin1 AUTO_INCREMENT = 2527300;

probe_result_id列是主键,probe_id是包含相关索引的列。



示例查询:

 选择IF(b.reactive_total IS NULL ,0,b.reactive_total)作为反应性总计,a。* FROM(选择计数(当asset_testing_results.asset_testing_year ='2016'AND asset_testing_results.asset_testing_month ='7'AND asset_testing_results.asset_stopped ='0'AND资产_testingdres = 'THEN 1 END)AS due_total,(COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_stopped = '0' 和asset_testing_results.asset_testing_completed = '1' AND asset_testing_results.asset_testing_satisfactory = '1' AND asset_testing_results.asset_testing_actioned =' 0'THEN 1 END)+(IF(probes_passed_total IS NULL,0,probes_passed_total)))AS合格总数(COUNT(CASE when when asset_testing_results.asset_testing_year ='2016'AND asset_testing_results .asset_stopped ='0'AND asset_testing_results.asset_testing_completed ='1'AND asset_testing_results.asset_testing_satisfactory ='0'AND asset_testing_results.asset_testing_actioned ='0'THEN 1 END)+(IF(probes_sailedto),失败为0 ,COUNT(当asset_testing_results.asset_testing_year ='2016'AND asset_testing_results.asset_stopped ='0'并且asset_testing_results.asset_testing_completed ='1'并且asset_testing_results.asset_testing_actioned ='1'THEN_ASE_ASE,ASE_REST_CASE_REST_RES_CASE_REST_RES_CASE_REST_RES_CASE_REST_RES_CASE_REST ='2016'AND asset_testing_results.asset_testing_month< '7'AND asset_testing_results.asset_testing_completed ='0'AND asset_testing_results.asset_testing_satisfactory ='0'AND asset_testing_results.asset_stopped ='0'THEN 1 END)AS missed_total,site.site_key,site.site_name FROM site LEFT = site.site_key LEFT JOIN子位置ON sub_location.location_key = location.location_key LEFT JOIN资产ON asset.sub_location_key = sub_location.sub_location_key AND asset.stopped ='0'LEFT JOIN asset_testing开启asset_testing.asset_type_key = asset.asset_type_key AND asset_testeds.probe '0'LEFT JOIN asset_testing_results ON asset_testing_results.asset_testing_key = asset_testing.asset_testing_key AND asset_testing_results.asset_key = asset.asset_key LEFT JOIN(选择site.site_key,COUNT(在p.probe_id为1时不为空的情况)AS probes_passed_total,COUNT(当p.probe_id不为空且p.asset_testing_key为NULL而p.temp_1不为空则为1结束)AS probes_failed_total FROM资产vs_probes左联接(选择q.probe_id,q.month,q.year,IF(r.temp_1为NULL,q.temp_1,r.temp_1)作为temp_1,r.asset_testing_key FROM(选择DISTINCT probe_results.probe_id,probe_results .month,probe_results.year,probe_results.temp_1来自probe_results LEFT JOIN资产vs_probes ON资产vs_probes.probe_id = probe_results.probe_id LEFT JOIN资产ON asset.asset_key =资产vs_probes.asset_key LEFT JOIN子位置ON_location_sub_location.sub_location。sub_location。 .location_key = sub_location.location_key LEFT JOIN网站在site.site_key = location.site_key WHERE site.client_key ='25')q LEFT JOIN(选择probe_results.month,probe_results.year,probe_results.probe_id,temp_1,asset_testing.asset_testing_key FROM probe_results LEFT JOIN资产vs_probes开启assetvs_probes.probe_id = probe_results.probe_id LEFT JOIN资产测试开启asset_testing.asset_testing_key =资产vs_probes.asset_testing_key LEFT JOIN资产开启asset.asset_key = assetvs_probes.asset_key LEFT JOIN子位置ON sub_location.sub_location_key = asset.sub_location_key LEFT JOIN location ON location.location_key = sub_location.location_key LEFT JOIN site ON site.site_key = location.site_key WHERE temp_1!='无效'AND(( temp_1> = test_min AND test_max ='')OR(temp_1< = test_max AND test_min ='')OR(temp_1> = test_min AND temp_1< = test_max))AND year ='2016'AND site.client_key = '25'GROUP BY probe_results.month,probe_results.year,probe_results.probe_id)r ON r.probe_id = q.probe_id AND r.month = q.month AND r.year = q.year WHERE q.year ='2016' GROUP BY probe_id,月,年)p ON p.probe_id =资产vs_probes.probe_id LEFT JOIN资产_testing on asset_testing.asset_testing_key =资产vs_probes.asset_testing_key LEFT JOIN资产ON asset.asset_key =资产vs_probes.asset_key LEFT JOIN子位置=子位置=资产左联接location ON location.location_key = sub_loca tion.location_key LEFT JOIN网站位于site.site_key = location.site_key GROUP BY site.site_key)probe_results ON probe_results.site_key = site.site_key WHERE site.client_key ='25'GROUP BY site.site_key)一个左侧联接(SELECT COUNT(当jobs.status ='3'的情况下,然后1结束)作为REactive_total,site.site_key,来自Jobs左联接Jobs_meta ON Jobs_meta.job_id = jobs.job_id和Jobs_meta.meta_key ='start_date'左联接site on site.site_key = Jobs。 site_key WHERE site.client_key ='25'AND jobs_meta.meta_value LIKE'%/ 2016%'GROUP BY site.site_key)b ON b.site_key = a.site_key 

谢谢

解决方案

基数(以及其他统计信息)被计算并由MySQL自动更新,因此您没有直接的方法来防止它丢失。



但是,您可以采取一些措施来减少这种情况的发生或纠正



首先,MySQL更新所有支持的索引统计信息如果您运行分析表命令,则会使用rted表引擎。 / p>

对于innodb表引擎,MySQL提供了一组可能影响采样行为的配置设置。设置及其作用在MySQL文档中进行了描述:





主要设置是innodb_stats_transient_sample_pages:


•较小的值(例如1或2)可能会导致$ b $的估计不准确b基数。



•增加innodb_stats_transient_sample_pages值可能需要
读取更多磁盘。值远大于8(例如100),可能会导致
显着降低打开表的时间或
执行SHOW TABLE STATUS的时间。



•优化器可能基于
索引选择性的不同估计来选择非常不同的查询计划




对于myisam,MySQL不提供这么多种设置。 myisam_stats_method设置在常规索引统计文档中进行了介绍


I have a MySQL database table that's quite large with 2.5million rows and growing. To speed up queries I've added an index to one of columns. When I set the index manually, through PHPMyAdmin for example, it's cardinality is around 1500 which seems about right and my queries run without issue.

The problem then comes after a few queries, (especially on an INSERT but not limited to), have been run, the cardinality of that index drops to 17 or 18, and queries run extremely slow. Sometimes it seems to work it's way back to about 1500 or I have to do it through PHPMyAdmin again.

Is there any way to stop this cardinality drop from happening?

CREATE TABLE IF NOT EXISTS `probe_results` (
  `probe_result_id` int(11) NOT NULL AUTO_INCREMENT,
  `date` date NOT NULL,
  `month` int(11) NOT NULL,
  `year` int(11) NOT NULL,
  `time` time NOT NULL,
  `type` varchar(11) NOT NULL,
  `probe_id` varchar(50) NOT NULL,
  `status` varchar(11) NOT NULL,
  `temp_1` decimal(11,0) NOT NULL,
  `temp_2` decimal(11,0) NOT NULL,
  `crc` varchar(11) NOT NULL,
  `raw_data` text NOT NULL,
  `txt_file` text NOT NULL,
  PRIMARY KEY (`probe_result_id`),
  KEY `probe_id` (`probe_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=2527300 ;

The 'probe_result_id' column is the primary key, the probe_id is the column with the index in question.

Example query:

SELECT IF(b.reactive_total IS NULL, 0, b.reactive_total) AS reactive_total, a.* FROM (SELECT COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_testing_month = '7' AND asset_testing_results.asset_stopped = '0' AND asset_testing_results.asset_testing_completed = '0' THEN 1 END) AS due_total, (COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_stopped = '0' AND asset_testing_results.asset_testing_completed = '1' AND asset_testing_results.asset_testing_satisfactory = '1' AND asset_testing_results.asset_testing_actioned = '0' THEN 1 END)+(IF(probes_passed_total IS NULL, 0, probes_passed_total))) AS passed_total, (COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_stopped = '0' AND asset_testing_results.asset_testing_completed = '1' AND asset_testing_results.asset_testing_satisfactory = '0' AND asset_testing_results.asset_testing_actioned = '0' THEN 1 END)+(IF(probes_failed_total IS NULL, 0, probes_failed_total))) AS failed_total, COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_stopped = '0' AND asset_testing_results.asset_testing_completed = '1' AND asset_testing_results.asset_testing_actioned = '1' THEN 1 END) AS actioned_total, COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_testing_month < '7' AND asset_testing_results.asset_testing_completed = '0' AND asset_testing_results.asset_testing_satisfactory = '0' AND asset_testing_results.asset_stopped = '0' THEN 1 END) AS missed_total, site.site_key, site.site_name FROM site LEFT JOIN location ON location.site_key = site.site_key LEFT JOIN sub_location ON sub_location.location_key = location.location_key LEFT JOIN asset ON asset.sub_location_key = sub_location.sub_location_key AND asset.stopped = '0' LEFT JOIN asset_testing ON asset_testing.asset_type_key = asset.asset_type_key AND asset_testing.probe_assessed = '0' LEFT JOIN asset_testing_results ON asset_testing_results.asset_testing_key = asset_testing.asset_testing_key AND asset_testing_results.asset_key = asset.asset_key LEFT JOIN (SELECT site.site_key, COUNT(CASE WHEN p.probe_id IS NOT NULL AND p.asset_testing_key IS NOT NULL THEN 1 END) AS probes_passed_total, COUNT(CASE WHEN p.probe_id IS NOT NULL AND p.asset_testing_key IS NULL AND p.temp_1 IS NOT NULL THEN 1 END) AS probes_failed_total FROM assetsvs_probes LEFT JOIN (SELECT q.probe_id, q.month, q.year, IF(r.temp_1 IS NULL, q.temp_1, r.temp_1) as temp_1, r.asset_testing_key FROM (SELECT DISTINCT probe_results.probe_id, probe_results.month, probe_results.year, probe_results.temp_1 FROM probe_results LEFT JOIN assetsvs_probes ON assetsvs_probes.probe_id = probe_results.probe_id LEFT JOIN asset ON asset.asset_key = assetsvs_probes.asset_key LEFT JOIN sub_location ON sub_location.sub_location_key = asset.sub_location_key LEFT JOIN location ON location.location_key = sub_location.location_key LEFT JOIN site ON site.site_key = location.site_key WHERE site.client_key = '25')q LEFT JOIN (SELECT probe_results.month, probe_results.year, probe_results.probe_id, temp_1, asset_testing.asset_testing_key FROM probe_results LEFT JOIN assetsvs_probes ON assetsvs_probes.probe_id = probe_results.probe_id LEFT JOIN asset_testing ON asset_testing.asset_testing_key = assetsvs_probes.asset_testing_key LEFT JOIN asset ON asset.asset_key = assetsvs_probes.asset_key LEFT JOIN sub_location ON sub_location.sub_location_key = asset.sub_location_key LEFT JOIN location ON location.location_key = sub_location.location_key LEFT JOIN site ON site.site_key = location.site_key WHERE temp_1 != 'invalid' AND ((temp_1 >= test_min AND test_max = '') OR (temp_1 <= test_max AND test_min = '') OR (temp_1 >= test_min AND temp_1 <= test_max)) AND year = '2016' AND site.client_key = '25' GROUP BY probe_results.month, probe_results.year, probe_results.probe_id)r ON r.probe_id = q.probe_id AND r.month = q.month AND r.year = q.year WHERE q.year = '2016' GROUP BY probe_id, month, year) p ON p.probe_id = assetsvs_probes.probe_id LEFT JOIN asset_testing ON asset_testing.asset_testing_key = assetsvs_probes.asset_testing_key LEFT JOIN asset ON asset.asset_key = assetsvs_probes.asset_key LEFT JOIN sub_location ON sub_location.sub_location_key = asset.sub_location_key LEFT JOIN location ON location.location_key = sub_location.location_key LEFT JOIN site ON site.site_key = location.site_key GROUP BY site.site_key) probe_results ON probe_results.site_key = site.site_key WHERE site.client_key = '25' GROUP BY site.site_key)a LEFT JOIN (SELECT COUNT(CASE WHEN jobs.status = '3' THEN 1 END) AS reactive_total, site.site_key FROM jobs LEFT JOIN jobs_meta ON jobs_meta.job_id = jobs.job_id AND jobs_meta.meta_key = 'start_date' LEFT JOIN site ON site.site_key = jobs.site_key WHERE site.client_key = '25' AND jobs_meta.meta_value LIKE '%/2016 %' GROUP BY site.site_key)b ON b.site_key = a.site_key

Thanks

解决方案

Cardinality (along with other statistics) is calculated and updated by MySQL automatically, so you do not have direct means to prevent it from dropping.

However, you can take a few steps to make this less likely to happen or correct the behaviour.

First of all, MySQL updates index statistics for all supported table engines if you run analyze table command.

For innodb table engine MySQL provides a set of configuration settings that can influence the behaviour of the sampling. The settings and their effect are described in the MySQL documentation:

The main setting is innodb_stats_transient_sample_pages:

• Small values like 1 or 2 can result in inaccurate estimates of cardinality.

• Increasing the innodb_stats_transient_sample_pages value might require more disk reads. Values much larger than 8 (say, 100), can cause a significant slowdown in the time it takes to open a table or execute SHOW TABLE STATUS.

• The optimizer might choose very different query plans based on different estimates of index selectivity

.

For myisam MySQL dos not provide such a variety of settings. myisam_stats_method setting is described in the general index statistics documentation

这篇关于MySQL表索引基数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆