MySQL性能优化:按日期时间字段排序 [英] MySQL performance optimization: order by datetime field

查看:107
本文介绍了MySQL性能优化:按日期时间字段排序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个表,该表包含大约100.000个博客帖子,并通过1:n关系链接到具有50个供稿的表.当我用select语句查询两个表时(按发布表的datetime字段排序),MySQL始终使用文件排序,导致查询时间非常慢(> 1秒).这是postings表的结构(简化):

I have a table with roughly 100.000 blog postings, linked to a table with 50 feeds via an 1:n relationship. When I query both tables with a select statement, ordered by a datetime field of the postings table, MySQL always uses filesort, resulting in very slow query times (>1 second). Here's the schema of the postings table (simplified):

+---------------------+--------------+------+-----+---------+----------------+
| Field               | Type         | Null | Key | Default | Extra          |
+---------------------+--------------+------+-----+---------+----------------+
| id                  | int(11)      | NO   | PRI | NULL    | auto_increment |
| feed_id             | int(11)      | NO   | MUL | NULL    |                |
| crawl_date          | datetime     | NO   |     | NULL    |                |
| is_active           | tinyint(1)   | NO   | MUL | 0       |                |
| link                | varchar(255) | NO   | MUL | NULL    |                |
| author              | varchar(255) | NO   |     | NULL    |                |
| title               | varchar(255) | NO   |     | NULL    |                |
| excerpt             | text         | NO   |     | NULL    |                |
| long_excerpt        | text         | NO   |     | NULL    |                |
| user_offtopic_count | int(11)      | NO   | MUL | 0       |                |
+---------------------+--------------+------+-----+---------+----------------+

这是feed表:

+-------------+--------------+------+-----+---------+----------------+
| Field       | Type         | Null | Key | Default | Extra          |
+-------------+--------------+------+-----+---------+----------------+
| id          | int(11)      | NO   | PRI | NULL    | auto_increment |
| type        | int(11)      | NO   | MUL | 0       |                |
| title       | varchar(255) | NO   |     | NULL    |                |
| website     | varchar(255) | NO   |     | NULL    |                |
| url         | varchar(255) | NO   |     | NULL    |                |
+-------------+--------------+------+-----+---------+----------------+

这是需要1秒钟以上的时间才能执行的查询.请注意,post_date字段具有索引,但是MySQL并未使用它对发布表进行排序:

And here's the query that takes >1 second to execute. Please note that the post_date field has an index, but MySQL isn't using it to sort the postings table:

SELECT 
    `postings`.`id`, 
    UNIX_TIMESTAMP(postings.post_date) as post_date, 
    `postings`.`link`, 
    `postings`.`title`, 
    `postings`.`author`, 
    `postings`.`excerpt`, 
    `postings`.`long_excerpt`, 
    `feeds`.`title` AS feed_title, 
    `feeds`.`website` AS feed_website
FROM 
    (`postings`)
JOIN 
    `feeds` 
ON 
    `feeds`.`id` = `postings`.`feed_id`
WHERE 
    `feeds`.`type` = 1 AND 
    `postings`.`user_offtopic_count` < 10 AND 
    `postings`.`is_active` = 1
ORDER BY 
    `postings`.`post_date` desc
LIMIT 
    15  

此查询中explain extended命令的结果表明MySQL正在使用文件排序:

The result of the explain extended command on this query shows that MySQL is using filesort:

+----+-------------+----------+--------+---------------------------------------+-----------+---------+--------------------------+-------+-----------------------------+
| id | select_type | table    | type   | possible_keys                         | key       | key_len | ref                      | rows  | Extra                       |
+----+-------------+----------+--------+---------------------------------------+-----------+---------+--------------------------+-------+-----------------------------+
|  1 | SIMPLE      | postings | ref    | feed_id,is_active,user_offtopic_count | is_active | 1       | const                    | 30996 | Using where; Using filesort |
|  1 | SIMPLE      | feeds    | eq_ref | PRIMARY,type                          | PRIMARY   | 4       | feedian.postings.feed_id |     1 | Using where                 |
+----+-------------+----------+--------+---------------------------------------+-----------+---------+--------------------------+-------+-----------------------------+

当我删除order by部分时,MySQL停止使用文件排序.如果您对如何优化此查询以使MySQL通过使用索引来排序和选择数据有任何想法,请告诉我.我已经尝试了一些方法,例如根据一些博客文章的建议,在所有按位置/按字段排序的位置上创建了一个组合索引,但这还是行不通的.

When I remove the order by part, MySQL stops using filesort. Please let me know if you have any ideas on how to optimize this query to get MySQL to sort and select the data by using indexes. I have already tried a few things such as creating a combined index on all where/order by fields, as suggested by a few blog postings, but this didn't work either.

推荐答案

postings (is_active, post_date)上(按此顺序)创建一个复合索引.

Create a composite index either on postings (is_active, post_date) (in that order).

它将用于在is_active上进行过滤和按post_date进行排序.

It will be used both for filtering on is_active and ordering by post_date.

MySQL应该在EXPLAIN EXTENDED中的该索引上显示REF访问方法.

MySQL should show REF access method over this index in EXPLAIN EXTENDED.

请注意,您在user_offtopic_count上具有RANGE过滤条件,这就是为什么在过滤和按其他字段排序时都不能在该字段上使用索引的原因.

Note that you have a RANGE filtering condition over user_offtopic_count, that's why you cannot use an index over this field both in filtering and in sorting by other field.

根据您的user_offtopic_count的选择性(即,有多少行满足user_offtopic_count < 10),在user_offtopic_count上创建索引并对post_dates进行排序可能会更有用.

Depending on how selective is your user_offtopic_count (i. e. how many rows satisfy user_offtopic_count < 10), it may be more useful to create an index on user_offtopic_count and let the post_dates be sorted.

为此,请在postings (is_active, user_offtopic_count)上创建一个复合索引,并确保使用对该索引的RANGE访问方法.

To do this, create a composite index on postings (is_active, user_offtopic_count) and make sure the RANGE access method over this index is used.

哪个索引会更快取决于您的数据分布.创建两个索引,FORCE它们,看看哪个更快:

Which index will be faster depends on your data distribuion. Create both indexes, FORCE them and see which is faster:

CREATE INDEX ix_active_offtopic ON postings (is_active, user_offtopic_count);
CREATE INDEX ix_active_date ON postings (is_active, post_date);

SELECT 
    `postings`.`id`, 
    UNIX_TIMESTAMP(postings.post_date) as post_date, 
    `postings`.`link`, 
    `postings`.`title`, 
    `postings`.`author`, 
    `postings`.`excerpt`, 
    `postings`.`long_excerpt`, 
    `feeds`.`title` AS feed_title, 
    `feeds`.`website` AS feed_website
FROM 
    `postings` FORCE INDEX (ix_active_offtopic)
JOIN 
    `feeds` 
ON 
    `feeds`.`id` = `postings`.`feed_id`
WHERE 
    `feeds`.`type` = 1 AND 
    `postings`.`user_offtopic_count` < 10 AND 
    `postings`.`is_active` = 1
ORDER BY 
    `postings`.`post_date` desc
LIMIT 
    15

/* This should show RANGE access with few rows and keep the FILESORT */

SELECT 
    `postings`.`id`, 
    UNIX_TIMESTAMP(postings.post_date) as post_date, 
    `postings`.`link`, 
    `postings`.`title`, 
    `postings`.`author`, 
    `postings`.`excerpt`, 
    `postings`.`long_excerpt`, 
    `feeds`.`title` AS feed_title, 
    `feeds`.`website` AS feed_website
FROM 
    `postings` FORCE INDEX (ix_active_date)
JOIN 
    `feeds` 
ON 
    `feeds`.`id` = `postings`.`feed_id`
WHERE 
    `feeds`.`type` = 1 AND 
    `postings`.`user_offtopic_count` < 10 AND 
    `postings`.`is_active` = 1
ORDER BY 
    `postings`.`post_date` desc
LIMIT 
    15

/* This should show REF access with lots of rows and no FILESORT */

这篇关于MySQL性能优化:按日期时间字段排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆