如何使JOIN查询使用索引? [英] How to make JOIN query use index?

查看:555
本文介绍了如何使JOIN查询使用索引?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两个表:

CREATE TABLE `articles` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(1000) DEFAULT NULL,
  `last_updated` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `last_updated` (`last_updated`),
) ENGINE=InnoDB AUTO_INCREMENT=799681 DEFAULT CHARSET=utf8 

CREATE TABLE `article_categories` (
  `article_id` int(11) NOT NULL DEFAULT '0',
  `category_id` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`article_id`,`category_id`),
  KEY `category_id` (`category_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

这是我的查询:

SELECT a.*
FROM
    articles AS a,
    article_categories AS c
WHERE
    a.id = c.article_id
    AND c.category_id = 78
    AND a.comment_cnt > 0
    AND a.deleted = 0
ORDER BY a.last_updated
LIMIT 100, 20

还有一个EXPLAIN:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: index
possible_keys: PRIMARY
          key: last_updated
      key_len: 9
          ref: NULL
         rows: 2040
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
         type: eq_ref
possible_keys: PRIMARY,fandom_id
          key: PRIMARY
      key_len: 8
          ref: db.a.id,const
         rows: 1
        Extra: Using index

它在第一个表上使用全索引扫描last_updated进行排序,但不使用y索引进行连接(在解释中为type: index).这对性能非常不利,并且会杀死整个数据库服务器,因为这是一个非常频繁的查询.

It uses a full index scan of last_updated on the first table for sorting, but does not use an y index for join (type: index in explain). This is very bad for performance and kills the whole database server, since this is a very frequent query.

我尝试使用STRAIGHT_JOIN反转表顺序,但这会给出filesort, using_temporary,甚至更糟.

I've tried reversing table order with STRAIGHT_JOIN, but this gives filesort, using_temporary, which is even worse.

有什么方法可以使mysql同时使用索引进行连接和排序吗?

Is there any way to make mysql use index for join and for sorting at the same time?

===更新===

我真的很绝望.也许某种形式的非规范化可以在这里有所帮助?

I'm really desparate in this. Maybe some kind of denormalization can help here?

推荐答案

如果您有很多类别,则无法使此查询高效.在MySQL中,没有一个索引可以一次覆盖两个表.

If you have lots of categories, this query cannot be made efficient. No single index can cover two tables at once in MySQL.

您必须进行反规范化:将last_updatedhas_commentsdeleted添加到article_categories:

You have to do denormalization: add last_updated, has_comments and deleted into article_categories:

CREATE TABLE `article_categories` (
  `article_id` int(11) NOT NULL DEFAULT '0',
  `category_id` int(11) NOT NULL DEFAULT '0',
  `last_updated` timestamp NOT NULL,
  `has_comments` boolean NOT NULL,
  `deleted` boolean NOT NULL,
  PRIMARY KEY (`article_id`,`category_id`),
  KEY `category_id` (`category_id`),
  KEY `ix_articlecategories_category_comments_deleted_updated` (category_id, has_comments, deleted, last_updated)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

并运行以下查询:

SELECT  *
FROM    (
        SELECT  article_id
        FROM    article_categories
        WHERE   (category_id, has_comments, deleted) = (78, 1, 0)
        ORDER BY
                last_updated DESC
        LIMIT   100, 20
        ) q
JOIN    articles a
ON      a.id = q.article_id

当然,每当更新article中的相关列时,您也应该更新article_categories.这可以在触发器中完成.

Of course you should update article_categories as well whenever you update relevant columns in article. This can be done in a trigger.

请注意,列has_comments是布尔值:这将允许使用等式谓词对索引进行单个范围扫描.

Note that the column has_comments is boolean: this will allow using an equality predicate to make a single range scan over the index.

还要注意,LIMIT进入子查询.这使MySQL使用默认情况下不使用的后行查找.请参阅我的博客中有关如何提高性能的文章:

Also note that the LIMIT goes into the subquery. This makes MySQL use late row lookups which it does not use by default. See this article in my blog about why do they increase performance:

如果您使用的是SQL Server,则可以在查询中建立可索引的视图,这实际上会生成带有附加字段的article_categories的非规范化索引副本,并由服务器自动维护.

If you were on SQL Server, you could make an indexable view over your query, which essentially would make a denormalized indexed copy of article_categories with the additional fields, automatically mainained by the server.

不幸的是,MySQL不支持此功能,您将必须手动创建这样的表并编写其他代码以使其与基本表保持同步.

Unfortunately, MySQL does not support this and you will have to create such a table manually and write additional code to keep it in sync with the base tables.

这篇关于如何使JOIN查询使用索引?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆