MySQL综合索引 [英] MySql composite index

查看:64
本文介绍了MySQL综合索引的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们正在使用MySql作为数据库

We are using MySql as our DB

以下查询在mysql表上运行(约2500万条记录).我在这里粘贴了两个查询.查询运行太慢,我想知道更好的复合索引是否可以改善这种情况.

The following query is runs on mysql table(approx 25million records). I pasted two queries here.The queries runs too slowly and I was wondering if better composite indexes might improve the situation.

关于最佳综合指数是什么的任何想法?

Any idea on what the best composite index would be?

并建议我这些查询是否需要复合索引

and Suggest me Is composite index required for these queries

第一查询

    EXPLAIN SELECT log_type,
       count(DISTINCT subscriber_id) AS distinct_count,
       count(*) as total_count
FROM stats.campaign_logs
WHERE domain = 'xxx'
  AND campaign_id='12345'
  AND log_type IN ('EMAIL_SENT', 'EMAIL_CLICKED', 'EMAIL_OPENED', 'UNSUBSCRIBED')
  AND log_time BETWEEN CONVERT_TZ('2015-02-12 00:00:00','+05:30','+00:00')
                   AND CONVERT_TZ('2015-02-19 23:59:58','+05:30','+00:00')
GROUP BY log_type

上述查询的解释

+----+-------------+---------------+-------------+--------------------------------------------------------------+--------------------------------+---------+------+-------+------------------------------------------------------------------------------+
| id | select_type | table         | type        | possible_keys                                                | key                            | key_len | ref  | rows  | Extra                                                                        |
+----+-------------+---------------+-------------+--------------------------------------------------------------+--------------------------------+---------+------+-------+------------------------------------------------------------------------------+
|  1 | SIMPLE      | campaign_logs | index_merge | campaign_id_index,domain_index,log_type_index,log_time_index | campaign_id_index,domain_index | 153,153 | NULL | 35683 | Using intersect(campaign_id_index,domain_index); Using where; Using filesort |
+----+-------------+---------------+-------------+--------------------------------------------------------------+--------------------------------+---------+------+-------+------------------------------------------------------------------------------+

第二查询

SELECT campaign_id
     , subscriber_id
     , campaign_name
     , log_time
     , log_type
     , message
     , UNIX_TIMESTAMP(log_time) AS time 
  FROM campaign_logs 
 WHERE domain = 'xxx'  
   AND log_type = 'EMAIL_OPENED'  
 ORDER  
    BY log_time DESC 
 LIMIT 20;

上述查询的解释

+----+-------------+---------------+-------------+-----------------------------+-----------------------------+---------+------+--------+---------------------------------------------------------------------------+
| id | select_type | table         | type        | possible_keys               | key                         | key_len | ref  | rows   | Extra                                                                     |
+----+-------------+---------------+-------------+-----------------------------+-----------------------------+---------+------+--------+---------------------------------------------------------------------------+
|  1 | SIMPLE      | campaign_logs | index_merge | domain_index,log_type_index | domain_index,log_type_index | 153,153 | NULL | 118392 | Using intersect(domain_index,log_type_index); Using where; Using filesort |
+----+-------------+---------------+-------------+-----------------------------+-----------------------------+---------+------+--------+---------------------------------------------------------------------------+

第三查询

EXPLAIN SELECT *, UNIX_TIMESTAMP(log_time) AS time FROM stats.campaign_logs WHERE domain = 'xxx' AND log_type <> 'EMAIL_SLEEP' AND  subscriber_id = '123' ORDER BY log_time DESC LIMIT 100

上述查询的解释

+----+-------------+---------------+------+-------------------------------------------------+---------------------+---------+-------+------+-----------------------------+
| id | select_type | table         | type | possible_keys                                   | key                 | key_len | ref   | rows | Extra                       |
+----+-------------+---------------+------+-------------------------------------------------+---------------------+---------+-------+------+-----------------------------+
|  1 | SIMPLE      | campaign_logs | ref  | subscriber_id_index,domain_index,log_type_index | subscriber_id_index | 153     | const |   35 | Using where; Using filesort |
+----+-------------+---------------+------+-------------------------------------------------+---------------------+---------+-------+------+-----------------------------+

如果您需要其他详细信息,我可以在此处提供

If you want any other details I can provide here

更新(2016年4月22日): 现在,我们想在现有表中再添加一列,即节点ID.一个广告活动可以有多个节点.无论我们在广告系列中生成什么报告,我们现在也都需要在各个节点上的那些报告.

UPDATE (2016/April/22) : Now we want to add one more column into existing table that is node id. One campaign can have multiple nodes. Whatever reports we are generating on campaigns we need those reports on individual nodes also now.

例如

SELECT log_type,
           count(DISTINCT subscriber_id) AS distinct_count,
           count(*) as total_count
    FROM stats.campaign_logs
    WHERE domain = 'xxx',
      AND campaign_id='12345',
      AND node_id = '34567',
      AND log_type IN ('EMAIL_SENT', 'EMAIL_CLICKED', 'EMAIL_OPENED', 'UNSUBSCRIBED')
      AND log_time BETWEEN CONVERT_TZ('2015-02-12 00:00:00','+05:30','+00:00')
                       AND CONVERT_TZ('2015-02-19 23:59:58','+05:30','+00:00')
    GROUP BY log_type

CREATE TABLE `camp_logs` (
  `domain` varchar(50) DEFAULT NULL,
  `campaign_id` varchar(50) DEFAULT NULL,
  `subscriber_id` varchar(50) DEFAULT NULL,
  `message` varchar(21000) DEFAULT NULL,
  `log_time` datetime DEFAULT NULL,
  `log_type` varchar(50) DEFAULT NULL,
  `level` varchar(50) DEFAULT NULL,
  `campaign_name` varchar(500) DEFAULT NULL,
  KEY `subscriber_id_index` (`subscriber_id`),
  KEY `log_type_index` (`log_type`),
  KEY `log_time_index` (`log_time`),
  KEY `campid_domain_logtype_logtime_subid_index` (`campaign_id`,`domain`,`log_type`,`log_time`,`subscriber_id`),
  KEY `domain_logtype_logtime_index` (`domain`,`log_type`,`log_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

SIZE问题.

由于我们有两个复合索引,因此索引文件迅速增加.以下是该表格的当前统计信息. 资料大小:30 GB 索引大小:35 GB

As we have two composite indexes , index file incresing rapidly. following are the table current stats. Data size : 30 GB Index size: 35 GB

对于有关node_id的报告,我们要更新现有的综合索引

for reports on node_id we want to update our existing composite index

来自

KEY `campid_domain_logtype_logtime_subid_index` (`campaign_id`,`domain`,`log_type`,`log_time`,`subscriber_id`),

KEY `campid_domain_logtype_logtime_subid_nodeid_index` (`campaign_id`,`domain`,`log_type`,`log_time`,`subscriber_id`,`node_id`)

您能为广告系列和节点级别的报告建议合适的组合索引吗?

Could you suggest suitable composite indexes for both campaign and node level reports.

谢谢

推荐答案

这是您的第一个查询:

SELECT A.log_type, count(*) as distinct_count, sum(A.total_count) as total_count
from (SELECT log_type, count(subscriber_id) as total_count
      FROM stats.campaign_logs
      WHERE domain = 'xxx' AND campaign_id = '12345' AND
            log_type IN ('EMAIL_SENT', 'EMAIL_CLICKED', 'EMAIL_OPENED', 'UNSUBSCRIBED') AND
             DATE(CONVERT_TZ(log_time,'+00:00','+05:30')) BETWEEN DATE('2015-02-12 00:00:00') AND DATE('2015-02-19 23:59:58')
      GROUP BY subscriber_id,log_type) A
GROUP BY A.log_type;

最好写成:

      SELECT log_type, count(DISTINCT subscriber_id) as total_count
      FROM stats.campaign_logs
      WHERE domain = 'xxx' AND campaign_id = '12345' AND
            log_type IN ('EMAIL_SENT', 'EMAIL_CLICKED', 'EMAIL_OPENED', 'UNSUBSCRIBED') AND
             DATE(CONVERT_TZ(log_time, '+00:00', '+05:30')) BETWEEN DATE('2015-02-12 00:00:00') AND DATE('2015-02-19 23:59:58')
      GROUP BY log_type;

关于此的最佳索引可能是:campaign_logs(domain, campaign_id, log_type, log_time, subscriber_id).这是查询的覆盖索引.前三个键应用于where过滤.

The best index on this is probably: campaign_logs(domain, campaign_id, log_type, log_time, subscriber_id). This is a covering index for the query. The first three keys should be used for the where filtering.

这篇关于MySQL综合索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆