MySQL性能优化:按日期时间字段排序 [英] MySQL performance optimization: order by datetime field
问题描述
我有一个表,该表包含大约100.000个博客帖子,并通过1:n关系链接到具有50个供稿的表.当我用select语句查询两个表时(按发布表的datetime字段排序),MySQL始终使用文件排序,导致查询时间非常慢(> 1秒).这是postings
表的结构(简化):
I have a table with roughly 100.000 blog postings, linked to a table with 50 feeds via an 1:n relationship. When I query both tables with a select statement, ordered by a datetime field of the postings table, MySQL always uses filesort, resulting in very slow query times (>1 second). Here's the schema of the postings
table (simplified):
+---------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| feed_id | int(11) | NO | MUL | NULL | |
| crawl_date | datetime | NO | | NULL | |
| is_active | tinyint(1) | NO | MUL | 0 | |
| link | varchar(255) | NO | MUL | NULL | |
| author | varchar(255) | NO | | NULL | |
| title | varchar(255) | NO | | NULL | |
| excerpt | text | NO | | NULL | |
| long_excerpt | text | NO | | NULL | |
| user_offtopic_count | int(11) | NO | MUL | 0 | |
+---------------------+--------------+------+-----+---------+----------------+
这是feed
表:
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| type | int(11) | NO | MUL | 0 | |
| title | varchar(255) | NO | | NULL | |
| website | varchar(255) | NO | | NULL | |
| url | varchar(255) | NO | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
这是需要1秒钟以上的时间才能执行的查询.请注意,post_date
字段具有索引,但是MySQL并未使用它对发布表进行排序:
And here's the query that takes >1 second to execute. Please note that the post_date
field has an index, but MySQL isn't using it to sort the postings table:
SELECT
`postings`.`id`,
UNIX_TIMESTAMP(postings.post_date) as post_date,
`postings`.`link`,
`postings`.`title`,
`postings`.`author`,
`postings`.`excerpt`,
`postings`.`long_excerpt`,
`feeds`.`title` AS feed_title,
`feeds`.`website` AS feed_website
FROM
(`postings`)
JOIN
`feeds`
ON
`feeds`.`id` = `postings`.`feed_id`
WHERE
`feeds`.`type` = 1 AND
`postings`.`user_offtopic_count` < 10 AND
`postings`.`is_active` = 1
ORDER BY
`postings`.`post_date` desc
LIMIT
15
此查询中explain extended
命令的结果表明MySQL正在使用文件排序:
The result of the explain extended
command on this query shows that MySQL is using filesort:
+----+-------------+----------+--------+---------------------------------------+-----------+---------+--------------------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+--------+---------------------------------------+-----------+---------+--------------------------+-------+-----------------------------+
| 1 | SIMPLE | postings | ref | feed_id,is_active,user_offtopic_count | is_active | 1 | const | 30996 | Using where; Using filesort |
| 1 | SIMPLE | feeds | eq_ref | PRIMARY,type | PRIMARY | 4 | feedian.postings.feed_id | 1 | Using where |
+----+-------------+----------+--------+---------------------------------------+-----------+---------+--------------------------+-------+-----------------------------+
当我删除order by
部分时,MySQL停止使用文件排序.如果您对如何优化此查询以使MySQL通过使用索引来排序和选择数据有任何想法,请告诉我.我已经尝试了一些方法,例如根据一些博客文章的建议,在所有按位置/按字段排序的位置上创建了一个组合索引,但这还是行不通的.
When I remove the order by
part, MySQL stops using filesort. Please let me know if you have any ideas on how to optimize this query to get MySQL to sort and select the data by using indexes. I have already tried a few things such as creating a combined index on all where/order by fields, as suggested by a few blog postings, but this didn't work either.
推荐答案
在postings (is_active, post_date)
上(按此顺序)创建一个复合索引.
Create a composite index either on postings (is_active, post_date)
(in that order).
它将用于在is_active
上进行过滤和按post_date
进行排序.
It will be used both for filtering on is_active
and ordering by post_date
.
MySQL
应该在EXPLAIN EXTENDED
中的该索引上显示REF
访问方法.
MySQL
should show REF
access method over this index in EXPLAIN EXTENDED
.
请注意,您在user_offtopic_count
上具有RANGE
过滤条件,这就是为什么在过滤和按其他字段排序时都不能在该字段上使用索引的原因.
Note that you have a RANGE
filtering condition over user_offtopic_count
, that's why you cannot use an index over this field both in filtering and in sorting by other field.
根据您的user_offtopic_count
的选择性(即,有多少行满足user_offtopic_count < 10
),在user_offtopic_count
上创建索引并对post_dates进行排序可能会更有用.
Depending on how selective is your user_offtopic_count
(i. e. how many rows satisfy user_offtopic_count < 10
), it may be more useful to create an index on user_offtopic_count
and let the post_dates be sorted.
为此,请在postings (is_active, user_offtopic_count)
上创建一个复合索引,并确保使用对该索引的RANGE
访问方法.
To do this, create a composite index on postings (is_active, user_offtopic_count)
and make sure the RANGE
access method over this index is used.
哪个索引会更快取决于您的数据分布.创建两个索引,FORCE
它们,看看哪个更快:
Which index will be faster depends on your data distribuion. Create both indexes, FORCE
them and see which is faster:
CREATE INDEX ix_active_offtopic ON postings (is_active, user_offtopic_count);
CREATE INDEX ix_active_date ON postings (is_active, post_date);
SELECT
`postings`.`id`,
UNIX_TIMESTAMP(postings.post_date) as post_date,
`postings`.`link`,
`postings`.`title`,
`postings`.`author`,
`postings`.`excerpt`,
`postings`.`long_excerpt`,
`feeds`.`title` AS feed_title,
`feeds`.`website` AS feed_website
FROM
`postings` FORCE INDEX (ix_active_offtopic)
JOIN
`feeds`
ON
`feeds`.`id` = `postings`.`feed_id`
WHERE
`feeds`.`type` = 1 AND
`postings`.`user_offtopic_count` < 10 AND
`postings`.`is_active` = 1
ORDER BY
`postings`.`post_date` desc
LIMIT
15
/* This should show RANGE access with few rows and keep the FILESORT */
SELECT
`postings`.`id`,
UNIX_TIMESTAMP(postings.post_date) as post_date,
`postings`.`link`,
`postings`.`title`,
`postings`.`author`,
`postings`.`excerpt`,
`postings`.`long_excerpt`,
`feeds`.`title` AS feed_title,
`feeds`.`website` AS feed_website
FROM
`postings` FORCE INDEX (ix_active_date)
JOIN
`feeds`
ON
`feeds`.`id` = `postings`.`feed_id`
WHERE
`feeds`.`type` = 1 AND
`postings`.`user_offtopic_count` < 10 AND
`postings`.`is_active` = 1
ORDER BY
`postings`.`post_date` desc
LIMIT
15
/* This should show REF access with lots of rows and no FILESORT */
这篇关于MySQL性能优化:按日期时间字段排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!