如何使JOIN查询使用索引? [英] How to make JOIN query use index?
问题描述
我有两个表:
CREATE TABLE `articles` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(1000) DEFAULT NULL,
`last_updated` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `last_updated` (`last_updated`),
) ENGINE=InnoDB AUTO_INCREMENT=799681 DEFAULT CHARSET=utf8
CREATE TABLE `article_categories` (
`article_id` int(11) NOT NULL DEFAULT '0',
`category_id` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`article_id`,`category_id`),
KEY `category_id` (`category_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
这是我的查询:
SELECT a.*
FROM
articles AS a,
article_categories AS c
WHERE
a.id = c.article_id
AND c.category_id = 78
AND a.comment_cnt > 0
AND a.deleted = 0
ORDER BY a.last_updated
LIMIT 100, 20
还有一个EXPLAIN
:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: index
possible_keys: PRIMARY
key: last_updated
key_len: 9
ref: NULL
rows: 2040
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: c
type: eq_ref
possible_keys: PRIMARY,fandom_id
key: PRIMARY
key_len: 8
ref: db.a.id,const
rows: 1
Extra: Using index
它在第一个表上使用全索引扫描last_updated
进行排序,但不使用y索引进行连接(在解释中为type: index
).这对性能非常不利,并且会杀死整个数据库服务器,因为这是一个非常频繁的查询.
It uses a full index scan of last_updated
on the first table for sorting, but does not use an y index for join (type: index
in explain). This is very bad for performance and kills the whole database server, since this is a very frequent query.
我尝试使用STRAIGHT_JOIN
反转表顺序,但这会给出filesort, using_temporary
,甚至更糟.
I've tried reversing table order with STRAIGHT_JOIN
, but this gives filesort, using_temporary
, which is even worse.
有什么方法可以使mysql同时使用索引进行连接和排序吗?
Is there any way to make mysql use index for join and for sorting at the same time?
===更新===
我真的很绝望.也许某种形式的非规范化可以在这里有所帮助?
I'm really desparate in this. Maybe some kind of denormalization can help here?
推荐答案
如果您有很多类别,则无法使此查询高效.在MySQL
中,没有一个索引可以一次覆盖两个表.
If you have lots of categories, this query cannot be made efficient. No single index can cover two tables at once in MySQL
.
您必须进行反规范化:将last_updated
,has_comments
和deleted
添加到article_categories
:
You have to do denormalization: add last_updated
, has_comments
and deleted
into article_categories
:
CREATE TABLE `article_categories` (
`article_id` int(11) NOT NULL DEFAULT '0',
`category_id` int(11) NOT NULL DEFAULT '0',
`last_updated` timestamp NOT NULL,
`has_comments` boolean NOT NULL,
`deleted` boolean NOT NULL,
PRIMARY KEY (`article_id`,`category_id`),
KEY `category_id` (`category_id`),
KEY `ix_articlecategories_category_comments_deleted_updated` (category_id, has_comments, deleted, last_updated)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
并运行以下查询:
SELECT *
FROM (
SELECT article_id
FROM article_categories
WHERE (category_id, has_comments, deleted) = (78, 1, 0)
ORDER BY
last_updated DESC
LIMIT 100, 20
) q
JOIN articles a
ON a.id = q.article_id
当然,每当更新article
中的相关列时,您也应该更新article_categories
.这可以在触发器中完成.
Of course you should update article_categories
as well whenever you update relevant columns in article
. This can be done in a trigger.
请注意,列has_comments
是布尔值:这将允许使用等式谓词对索引进行单个范围扫描.
Note that the column has_comments
is boolean: this will allow using an equality predicate to make a single range scan over the index.
还要注意,LIMIT
进入子查询.这使MySQL
使用默认情况下不使用的后行查找.请参阅我的博客中有关如何提高性能的文章:
Also note that the LIMIT
goes into the subquery. This makes MySQL
use late row lookups which it does not use by default. See this article in my blog about why do they increase performance:
如果您使用的是SQL Server,则可以在查询中建立可索引的视图,这实际上会生成带有附加字段的article_categories
的非规范化索引副本,并由服务器自动维护.
If you were on SQL Server, you could make an indexable view over your query, which essentially would make a denormalized indexed copy of article_categories
with the additional fields, automatically mainained by the server.
不幸的是,MySQL
不支持此功能,您将必须手动创建这样的表并编写其他代码以使其与基本表保持同步.
Unfortunately, MySQL
does not support this and you will have to create such a table manually and write additional code to keep it in sync with the base tables.
这篇关于如何使JOIN查询使用索引?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!