如何使索引良好的MySQL表有效连接 [英] How to make well indexed MySQL tables join effectively

查看:98
本文介绍了如何使索引良好的MySQL表有效连接的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是第一个表'tbl1':

Here is the first table 'tbl1':

+---------+---------------------+------+-----+---------+----------------+
| Field   | Type                | Null | Key | Default | Extra          |
+---------+---------------------+------+-----+---------+----------------+
| val     | varchar(45)         | YES  | MUL | NULL    |                |
| id      | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
+---------+---------------------+------+-----+---------+----------------+

及其索引:

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tbl1  |          0 | PRIMARY  |            1 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
| tbl1  |          1 | val      |            1 | val         | A         |     2147085 |     NULL | NULL   | YES  | BTREE      |         |
| tbl1  |          1 | id_val   |            1 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
| tbl1  |          1 | id_val   |            2 | val         | A         |   201826018 |     NULL | NULL   | YES  | BTREE      |         |
| tbl1  |          1 | val_id   |            1 | val         | A         |     2147085 |     NULL | NULL   | YES  | BTREE      |         |
| tbl1  |          1 | val_id   |            2 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

(进行一些额外索引的原因是: http://bit.ly/KWx1Xz .)

(The reason for some extra indexing is this: http://bit.ly/KWx1Xz.)

第二张表几乎相同.不过,这是其索引基数:

The second table is just about the same. Here are its index cardinalities though:

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tbl2   |          0 | PRIMARY  |            1 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
| tbl2   |          1 | val      |            1 | val         | A         |      881336 |     NULL | NULL   | YES  | BTREE      |         |
| tbl2   |          1 | id_val   |            1 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
| tbl2   |          1 | id_val   |            2 | val         | A         |   201826018 |     NULL | NULL   | YES  | BTREE      |         |
| tbl2   |          1 | val_id   |            1 | val         | A         |      881336 |     NULL | NULL   | YES  | BTREE      |         |
| tbl2   |          1 | val_id   |            2 | id          | A         |   201826018 |     NULL | NULL   |      | BTREE      |         |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

任务是在val列上将它们进行内部联接,并获取ID的列表(并在1秒钟内完成).

The task is to inner-join them on the val column and get the id's list (and to do it in 1 second).

这是加入"方法:

SELECT tbl1.id FROM tbl1 JOIN tbl2 ON tbl1.val = 'iii' AND tbl2.val = 'iii' AND tbl1.id = tbl2.id;

结果:设置10831行( 55.15秒)

Result: 10831 rows in set (55.15 sec)

查询说明:

+----+-------------+--------+--------+----------------------------------+---------+---------+---------------------------+------+--------------------------+
| id | select_type | table  | type   | possible_keys                    | key     | key_len | ref                       | rows | Extra                    |
+----+-------------+--------+--------+----------------------------------+---------+---------+---------------------------+------+--------------------------+
|  1 | SIMPLE      | tbl1   | ref    | PRIMARY,val,id_val,val_id        | val_id  | 138     | const                     | 5160 | Using where; Using index |
|  1 | SIMPLE      | tbl2   | eq_ref | PRIMARY,val,id_val,val_id        | PRIMARY | 8       | search_test.tbl1.id       | 1    | Using where              |
+----+-------------+--------+--------+----------------------------------+---------+---------+---------------------------+------+--------------------------+

这是介入"方法:

SELECT id FROM tbl1 WHERE val = 'iii' and id IN (SELECT id FROM tbl2 WHERE val = 'iii');

结果:设置10831行( 1分钟10.15秒)

Result: 10831 rows in set (1 min 10.15 sec)

说明:

+----+--------------------+--------+-----------------+---------------------------------+---------+---------+-------+------+--------------------------+
| id | select_type        | table  | type            | possible_keys                   | key     | key_len | ref   | rows | Extra                    |
+----+--------------------+--------+-----------------+---------------------------------+---------+---------+-------+------+--------------------------+
|  1 | PRIMARY            | tbl1   | ref             | val,val_id                      | val_id  | 138     | const | 8553 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | tbl2   | unique_subquery | PRIMARY,val,id_val,val_id       | PRIMARY | 8       | func  |    1 | Using where              |
+----+--------------------+--------+-----------------+---------------------------------+---------+---------+-------+------+--------------------------+

所以,这是一个问题:如何调整此查询以使MySQL在第二秒内完成?

So, here is the question: how to tweak this query to let MySQL accomplish it in a second?

推荐答案

好,我已经在每个表的30,000多个记录上对此进行了测试,并且运行速度非常快.

OK I have tested this on 30,000+ records per table and it runs pretty darn quick.

目前,您正在对两个大型表执行联接,但是如果您首先在每个表的'val'上扫描匹配项,则将大大减少联接集的大小.

As it currently stands you're performing a join on two massive tables presently but if you scan for matches on 'val' on each table first that will reduce the size of your join sets substantially.

我最初将此答案作为一组子查询发布,但我没有意识到MySQL在嵌套子查询上的速度很慢,因为它是从外部执行的.但是,如果将子查询定义为视图,则它将由内而外运行

I originally posted this answer as a set of subqueries but I didn't realize that MySQL is painfully slow at nested subqueries since it executes from the outside in. However if you define the subqueries as views it runs them from the inside out.

因此,首先创建视图.

CREATE VIEW tbl1_iii AS (
SELECT * FROM tbl1 WHERE val='iii'
);
CREATE VIEW tbl2_iii AS (
SELECT * FROM tbl2 WHERE val='iii'
);

然后运行查询.

SELECT tbl1_iii.id from tbl1_iii,tbl2_iii
WHERE tbl1_iii.id = tbl2_iii.id;

闪电.

这篇关于如何使索引良好的MySQL表有效连接的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆