具有范围条件的联接表上的MySQL优化 [英] MySQL optimization on join tables with range criteria
问题描述
我将通过使用一个表中的单个位置到另一个表中的范围(由两列表示)来联接两个表.
I am going to join two tables by using a single position in one table to the range (represented by two columns) in another table.
但是,性能太慢,大约需要20分钟. 我尝试在表上添加索引或更改查询. 但是性能仍然很差.
However, the performance is too slow, which is about 20 mins. I have tried adding the index on the table or changing the query. But the performance is still poor.
所以,我要求优化加入速度.
So, I am asking for optimization of the joining speed.
以下是对MySQL的查询.
The following is the query to MySQL.
mysql> SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score
-> FROM `inVar`
-> LEFT JOIN `openChrom_K562`
-> ON (
-> `inVar`.chrom=`openChrom_K562`.chrom AND
-> `inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd
-> );
inVar
和openChrom_K562
是我使用的表.
inVar
在每一行中存储单个位置.
inVar
stores the single position in each row.
openChrom_K562
存储由chromStart
和chromEnd
指示的范围信息.
openChrom_K562
stores the range information indicated by chromStart
and chromEnd
.
inVar
包含57902行,openChrom_K562
包含137373行.
inVar
contains 57902 rows and openChrom_K562
has 137373 rows respectively.
表格上的字段.
mysql> DESCRIBE inVar;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| chrom | varchar(31) | NO | PRI | NULL | |
| pos | int(10) | NO | PRI | NULL | |
+-------+-------------+------+-----+---------+-------+
mysql> DESCRIBE openChrom_K562;
+------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| chrom | varchar(31) | NO | MUL | NULL | |
| chromStart | int(10) | NO | MUL | NULL | |
| chromEnd | int(10) | NO | | NULL | |
| score | int(10) | NO | | NULL | |
+------------+-------------+------+-----+---------+-------+
表格中内置的索引
Index built in the tables
mysql> SHOW INDEX FROM inVar;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| inVar | 0 | PRIMARY | 1 | chrom | A | NULL | NULL | NULL | | BTREE | |
| inVar | 0 | PRIMARY | 2 | pos | A | 57902 | NULL | NULL | | BTREE | |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
mysql> SHOW INDEX FROM openChrom_K562;
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| openChrom_K562 | 1 | start_end | 1 | chromStart | A | 137373 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | start_end | 2 | chromEnd | A | 137373 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_only | 1 | chrom | A | 22 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_start | 1 | chrom | A | 22 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_start | 2 | chromStart | A | 137373 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_end | 1 | chrom | A | 22 | NULL | NULL | | BTREE | |
| openChrom_K562 | 1 | chrom_end | 2 | chromEnd | A | 137373 | NULL | NULL | | BTREE | |
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
在MySQL上的执行计划
Execution plan on MySQL
mysql> EXPLAIN SELECT `inVar`.chrom, `inVar`.pos, score FROM `inVar` LEFT JOIN `openChrom_K562` ON ( inVar.chrom=openChrom_K562.chrom AND `inVar`.pos BETWEEN chromStart AND chromEnd );
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+
| 1 | SIMPLE | inVar | index | NULL | PRIMARY | 37 | NULL | 57902 | Using index |
| 1 | SIMPLE | openChrom_K562 | ref | start_end,chrom_only,chrom_start,chrom_end | chrom_only | 33 | tmp.inVar.chrom | 5973 | |
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+
似乎只能通过在两个表中查找chrom
进行优化.然后在表中进行蛮力比较.
It seems it only optimizes by looking chrom
in two tables. Then do the brute-force comparing in the tables.
有什么方法可以做进一步的优化,例如在位置上建立索引?
Is there any ways to do the further optimization like indexing on the position?
(这是我第一次发布问题,抱歉发布质量不佳.)
推荐答案
chrom_only
对于您的联接可能是一个错误的索引选择,因为您只有chrom 22值.
chrom_only
is likely to be a bad index selection for your join as you only have chrom 22 values.
如果我已经解释了这一权利,那么使用start_end
If I have interpreted this right the query should be faster if using start_end
SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score
FROM `inVar`
LEFT JOIN `openChrom_K562`
USE INDEX (`start_end`)
ON (
`inVar`.chrom=`openChrom_K562`.chrom AND
`inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd
)
这篇关于具有范围条件的联接表上的MySQL优化的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!