如何避免“临时使用"?在多对多查询中? [英] How to avoid "Using temporary" in many-to-many queries?

查看:82
本文介绍了如何避免“临时使用"?在多对多查询中?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

此查询非常简单,我要做的就是按last_updated字段的顺序获取给定类别中的所有文章:

This query is very simple, all I want to do, is get all the articles in given category ordered by last_updated field:

SELECT
    `articles`.*
FROM
    `articles`,
    `articles_to_categories`
WHERE
        `articles`.`id` = `articles_to_categories`.`article_id`
        AND `articles_to_categories`.`category_id` = 1
ORDER BY `articles`.`last_updated` DESC
LIMIT 0, 20;

但是它运行非常慢.这是EXPLAIN所说的:

But it runs very slow. Here is what EXPLAIN said:

select_type  table                   type     possible_keys           key         key_len  ref                                rows  Extra
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SIMPLE       articles_to_categories  ref      article_id,category_id  article_id  5        const                              5016  Using where; Using temporary; Using filesort
SIMPLE       articles                eq_ref   PRIMARY                 PRIMARY     4        articles_to_categories.article_id  1

是否可以重写此查询或向我的PHP脚本中添加其他逻辑以避免Using temporary; Using filesort并加快速度?

Is there a way to rewrite this query or add additional logic to my PHP scripts to avoid Using temporary; Using filesort and speed thing up?

表结构:

*articles*
id | title | content | last_updated

*articles_to_categories*
article_id | category_id

更新

我已索引last_updated.我猜我的情况在d 建议中得到了解释:

I have last_updated indexed. I guess my situation is explained in documentation:

在某些情况下,MySQL无法使用 索引以解析ORDER BY, 尽管它仍然使用索引来查找 与WHERE子句匹配的行. 这些情况包括:

In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following:

用于获取行的键与在ORDER BY中使用的键不同: SELECT * FROM t1 WHERE key2 = constant ORDER BY key1;

The key used to fetch the rows is not the same as the one used in the ORDER BY: SELECT * FROM t1 WHERE key2=constant ORDER BY key1;

您要加入许多表,并且 ORDER BY中的列并非全部 从第一个非恒定表中 用于检索行. (这是 EXPLAIN输出中的第一个表是 没有const连接类型.)

You are joining many tables, and the columns in the ORDER BY are not all from the first nonconstant table that is used to retrieve rows. (This is the first table in the EXPLAIN output that does not have a const join type.)

但是我仍然不知道如何解决这个问题.

but I still have no idea how to fix this.

推荐答案

这是我之前一段时间为一个与性能相关的类似问题所做的简化示例,该问题利用了innodb集群主键索引(显然仅适用于innodb!)

Here's a simplified example I did for a similar performance related question sometime ago that takes advantage of innodb clustered primary key indexes (obviously only available with innodb !!)

  • http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html
  • http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/

您有3个表:category,product和product_category如下:

You have 3 tables: category, product and product_category as follows:

drop table if exists product;
create table product
(
prod_id int unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb; 

drop table if exists category;
create table category
(
cat_id mediumint unsigned not null auto_increment primary key,
name varchar(255) not null unique
)
engine = innodb; 

drop table if exists product_category;
create table product_category
(
cat_id mediumint unsigned not null,
prod_id int unsigned not null,
primary key (cat_id, prod_id) -- **note the clustered composite index** !!
)
engine = innodb;

最重要的是product_catgeory群集组合主键的顺序,因为此情况下的典型查询始终以(x,y,z ...)中的cat_id = x或cat_id开头.

The most import thing is the order of the product_catgeory clustered composite primary key as typical queries for this scenario always lead by cat_id = x or cat_id in (x,y,z...).

我们有 500K 类别, 100万个产品和 1.25亿个产品类别.

We have 500K categories, 1 million products and 125 million product categories.

select count(*) from category;
+----------+
| count(*) |
+----------+
|   500000 |
+----------+

select count(*) from product;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+

select count(*) from product_category;
+-----------+
| count(*)  |
+-----------+
| 125611877 |
+-----------+

因此,让我们看一下该架构如何执行与您的查询类似的查询.所有查询均在空缓冲区且无查询缓存的情况下冷运行(在mysql重新启动后).

So let's see how this schema performs for a query similar to yours. All queries are run cold (after mysql restart) with empty buffers and no query caching.

select
 p.*
from
 product p
inner join product_category pc on 
    pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
 p.prod_id desc -- sry dont a date field in this sample table - wont make any difference though
limit 20;

+---------+----------------+
| prod_id | name           |
+---------+----------------+
|  993561 | Product 993561 |
|  991215 | Product 991215 |
|  989222 | Product 989222 |
|  986589 | Product 986589 |
|  983593 | Product 983593 |
|  982507 | Product 982507 |
|  981505 | Product 981505 |
|  981320 | Product 981320 |
|  978576 | Product 978576 |
|  973428 | Product 973428 |
|  959384 | Product 959384 |
|  954829 | Product 954829 |
|  953369 | Product 953369 |
|  951891 | Product 951891 |
|  949413 | Product 949413 |
|  947855 | Product 947855 |
|  947080 | Product 947080 |
|  945115 | Product 945115 |
|  943833 | Product 943833 |
|  942309 | Product 942309 |
+---------+----------------+
20 rows in set (0.70 sec) 

explain
select
 p.*
from
 product p
inner join product_category pc on 
    pc.cat_id = 4104 and pc.prod_id = p.prod_id
order by
 p.prod_id desc -- sry dont a date field in this sample table - wont make any diference though
limit 20;

+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref           | rows | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
|  1 | SIMPLE      | pc    | ref    | PRIMARY       | PRIMARY | 3       | const           |  499 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | p     | eq_ref | PRIMARY       | PRIMARY | 4       | vl_db.pc.prod_id |    1 |                                              |
+----+-------------+-------+--------+---------------+---------+---------+------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

这就是0.70秒的寒冷-哎呀.

So that's 0.70 seconds cold - ouch.

希望这会有所帮助:)

编辑

刚刚阅读了您对我的评论的回复,看来您有两个选择之一:

Having just read your reply to my comment above it seems you have one of two choices to make:

create table articles_to_categories
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(article_id, category_id), -- good for queries that lead with article_id = x
key (category_id)
)
engine=innodb;

或.

create table categories_to_articles
(
article_id int unsigned not null,
category_id mediumint unsigned not null,
primary key(category_id, article_id), -- good for queries that lead with category_id = x
key (article_id)
)
engine=innodb;

取决于关于如何定义群集PK的典型查询.

depends on your typical queries as to how you define your clustered PK.

这篇关于如何避免“临时使用"?在多对多查询中?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆