优化比较 MySQL 两个大表中的数据 [英] Optimise comparing data in two big MySQL tables

查看:69
本文介绍了优化比较 MySQL 两个大表中的数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我如何优化查询,它将找到所有记录,其中:

How could I optimise query, which will find all records, which:

  • 有activation_request.date_confirmed 不为空

  • 在另一个表中没有相关的字符串值:activation_request.email =user.username 不应返回任何记录

我试过了:

SELECT  email 
FROM activation_request l 
    LEFT JOIN user r ON r.username = l.email 
WHERE l.date_confirmed is not null 
AND r.username IS NULL

SELECT email 
FROM  activation_request 
WHERE  date_confirmed is not null 
AND NOT EXISTS (SELECT 1 
                FROM user  
                WHERE  user.username = activation_request.email
                )

但是两个表都有 xxx.xxx.xxx 记录,因此在运行这些查询一整夜之后不幸的是我没有得到任何结果.

but both tables have xxx.xxx.xxx records hence after all night running those queries unfortunatelly I haven't got any results.

创建语句:

CREATE TABLE `activation_request` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `version` bigint(20) NOT NULL,
  `date_confirmed` datetime DEFAULT NULL,
  `email` varchar(255) NOT NULL,
  (...)
  PRIMARY KEY (`id`),
  KEY `emailIdx` (`email`),
  KEY `reminderSentIdx` (`date_reminder_sent`),
  KEY `idx_resent_needed` (`date_reminder_sent`,`date_confirmed`),
) ENGINE=InnoDB AUTO_INCREMENT=103011867 DEFAULT CHARSET=utf8;




CREATE TABLE `user` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `version` bigint(20) NOT NULL,
  `username` varchar(255) NOT NULL,
  (...)
  PRIMARY KEY (`id`),
  UNIQUE KEY `Q52plW9W7TJWZcLj00K3FmuhwMSw4F7vmxJGyjxz5iiINVR9fXyacEoq4rHppb` (`username`),
) ENGINE=InnoDB AUTO_INCREMENT=431400048 DEFAULT CHARSET=latin1;

LEFT JOIN 的说明:

Explain for LEFT JOIN:

[[id:1, select_type:SIMPLE, table:l, type:ALL, possible_keys:null,key:null, key_len:null, ref:null, rows:49148965, Extra:Using where],[id:1, select_type:SIMPLE, table:r, type:index, possible_keys:null,密钥:Q52plW9W7TJWZcLj00K3FmuhwMSw4F7vmxJGyjxz5iiINVR9fXyacEoq4rHppb,key_len:257, ref:null, rows:266045508, Extra:Using where;不存在;使用索引;使用连接缓冲区(块嵌套循环)]] [[id:1,选择类型:简单,表:l,类型:全部,可能的键:空,键:空,key_len:null, ref:null, rows:49148965, Extra:Using where], [id:1,选择类型:简单,表:r,类型:索引,可能的键:空,密钥:Q52plW9W7TJWZcLj00K3FmuhwMSw4F7vmxJGyjxz5iiINVR9fXyacEoq4rHppb,key_len:257, ref:null, rows:266045508, Extra:Using where;不存在;使用索引;使用连接缓冲区(块嵌套循环)]]

[[id:1, select_type:SIMPLE, table:l, type:ALL, possible_keys:null, key:null, key_len:null, ref:null, rows:49148965, Extra:Using where], [id:1, select_type:SIMPLE, table:r, type:index, possible_keys:null, key:Q52plW9W7TJWZcLj00K3FmuhwMSw4F7vmxJGyjxz5iiINVR9fXyacEoq4rHppb, key_len:257, ref:null, rows:266045508, Extra:Using where; Not exists; Using index; Using join buffer (Block Nested Loop)]] [[id:1, select_type:SIMPLE, table:l, type:ALL, possible_keys:null, key:null, key_len:null, ref:null, rows:49148965, Extra:Using where], [id:1, select_type:SIMPLE, table:r, type:index, possible_keys:null, key:Q52plW9W7TJWZcLj00K3FmuhwMSw4F7vmxJGyjxz5iiINVR9fXyacEoq4rHppb, key_len:257, ref:null, rows:266045508, Extra:Using where; Not exists; Using index; Using join buffer (Block Nested Loop)]]

在暂存数据库上添加索引(数据略少,但结构相同)后,查询现在运行约 24 小时,但仍然没有结果):

After adding indexes on staging db (with slightly less data, but the same structure) query is now running ~24h and still no results):

$ show processlist;

| Id | User    | Host                                            | db       | Command | Time   | State        | Info 
| 64 | root    | localhost                                       | staging_db   | Query   | 110072 | Sending data | SELECT ar.email FROM  activation_request ar WHERE ar.date_confirmed is not null AND NOT EXISTS (SELE |

Mysql 版本:

$ select version();
5.6.16-1~exp1

列表中的所有其他命令都是 Sleep,因此没有其他查询在运行,也没有可能干扰/锁定行.

All other commands on the list are Sleep so there is no other query running and possibly disturbing/locking rows.

推荐答案

对于这个查询:

SELECT ar.email 
FROM  activation_request ar
WHERE ar.date_confirmed is not null AND
      NOT EXISTS (SELECT 1 
                  FROM user u
                  WHERE u.username = ar.email
                 )

我会推荐 activation_request(date_confirmed, email)user(username) 的索引.

I would recommend indexes on activation_request(date_confirmed, email) and user(username).

除非你有大量的数据,否则你的问题可能是表被锁定了.

Unless you have a really humongous amount of data, though, your problem may be that tables are locked.

这篇关于优化比较 MySQL 两个大表中的数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆