Optimizing MySQL query to avoid scanning a lot of rows

Question

I am running an application that uses tables similar to the ones below: one table for articles and another for tags. I want to get the latest 30 articles for a specific tag, for example "acer", ordered by article ID. The query below does the job, but I think it is not indexed correctly, because it will scan a lot of rows if many articles are related to that tag. How can I get the same result without scanning a large number of rows?

EXPLAIN SELECT title
FROM tag, article
WHERE tag = 'acer'
AND tag.article_id = article.id
ORDER BY tag.article_id DESC 
LIMIT 0 , 30 

Output

id  select_type  table    type    possible_keys  key      key_len  ref                    rows    Extra
1   SIMPLE       tag      ref     tag            tag      92       const                  220439  Using where; Using index
1   SIMPLE       article  eq_ref  PRIMARY        PRIMARY  4        testdb.tag.article_id  1

The following are the tables and sample data:

CREATE TABLE `article` (
  `id` int(11) NOT NULL auto_increment,
  `title` varchar(60) NOT NULL,
  `time_stamp` int(11) NOT NULL,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1000001 ;

-- 
-- Dumping data for table `article`
-- 

INSERT INTO `article` VALUES (1, 'Saudi Apple type D', 1313390211);
INSERT INTO `article` VALUES (2, 'Japan Apple type A', 1313420771);
INSERT INTO `article` VALUES (3, 'UAE Samsung type B', 1313423082);
INSERT INTO `article` VALUES (4, 'UAE Apple type H', 1313417337);
INSERT INTO `article` VALUES (5, 'Japan Samsung type D', 1313398875);
INSERT INTO `article` VALUES (6, 'UK Acer type B', 1313387888);
INSERT INTO `article` VALUES (7, 'Saudi Sony type D', 1313429416);
INSERT INTO `article` VALUES (8, 'UK Apple type B', 1313394549);
INSERT INTO `article` VALUES (9, 'Japan HP type A', 1313427730);
INSERT INTO `article` VALUES (10, 'Japan Acer type C', 1313400046);



CREATE TABLE `tag` (
  `tag` varchar(30) NOT NULL,
  `article_id` int(11) NOT NULL,
  UNIQUE KEY `tag` (`tag`,`article_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

-- 
-- Dumping data for table `tag`
-- 


INSERT INTO `tag` VALUES ('Samsung', 1);
INSERT INTO `tag` VALUES ('Acer', 2);
INSERT INTO `tag` VALUES ('Sony', 3);
INSERT INTO `tag` VALUES ('Apple', 4);
INSERT INTO `tag` VALUES ('Acer', 5);
INSERT INTO `tag` VALUES ('HP', 6);
INSERT INTO `tag` VALUES ('Acer', 7);
INSERT INTO `tag` VALUES ('Sony', 7);
INSERT INTO `tag` VALUES ('Acer', 7);
INSERT INTO `tag` VALUES ('Samsung', 9);

Recommended answer

What makes you think the query will examine a large number of rows?

The query will scan exactly 30 records using the UNIQUE index on tag (tag, article_id), join each of them to article on its PRIMARY KEY, and stop.

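This is exactly what your plan says: Extra shows no "Using filesort", so the rows come out of the index already ordered by article_id and the LIMIT can stop the scan early. The rows value of 220439 is the optimizer's estimate of how many index entries match the tag; it does not mean that many rows are actually read.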
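One way to check how many rows the server actually touches, rather than relying on the estimate in EXPLAIN, is to look at the session handler counters before and after the query. The following is only a sketch: it assumes the tables from the question, and that your account is allowed to run FLUSH STATUS (it normally needs the RELOAD privilege, and its exact behavior differs between MySQL versions).

```sql
-- Reset the session status counters (assumption: RELOAD privilege is available).
FLUSH STATUS;

-- The query from the question, without EXPLAIN, so that it really executes.
SELECT title
FROM tag, article
WHERE tag = 'acer'
AND tag.article_id = article.id
ORDER BY tag.article_id DESC
LIMIT 0 , 30;

-- Handler_read_key counts index lookups; Handler_read_prev and
-- Handler_read_next count steps backward and forward along an index.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

If the scan stops at the LIMIT, I would expect the Handler_read_key and Handler_read_prev values to be on the order of 30, not anywhere near 220439.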

I just made this test script:

CREATE TABLE `article` (
  `id` int(11) NOT NULL auto_increment,
  `title` varchar(60) NOT NULL,
  `time_stamp` int(11) NOT NULL,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1000001 ;

CREATE TABLE `tag` (
  `tag` varchar(30) NOT NULL,
  `article_id` int(11) NOT NULL,
  UNIQUE KEY `tag` (`tag`,`article_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT
INTO    article
SELECT  id, CONCAT('Article ', id), UNIX_TIMESTAMP('2011-08-17' - INTERVAL id SECOND)
FROM    t_source;

INSERT
INTO    tag
SELECT  CASE fld WHEN 1 THEN CONCAT('tag', (id - 1) div 10 + 1) ELSE tag END AS tag, id
FROM    (
        SELECT  tag,
                id,
                FIELD(tag, 'Other', 'Acer', 'Sony', 'HP', 'Dell') AS fld,
                RAND(20110817) AS rnd
        FROM    (
                SELECT  'Other' AS tag
                UNION ALL
                SELECT  'Acer' AS tag
                UNION ALL
                SELECT  'Sony' AS tag
                UNION ALL
                SELECT  'HP' AS tag
                UNION ALL
                SELECT  'Dell' AS tag
                ) t
        JOIN    t_source
        ) q
WHERE   POWER(3, -fld) > rnd;

Here t_source is a table with 1M records in it. I then ran your query:

SELECT  *
FROM    tag t
JOIN    article a
ON      a.id = t.article_id
WHERE   t.tag = 'acer'
ORDER BY
        t.article_id DESC
LIMIT 30;

It was instant.
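The test script reads from t_source, whose definition is not shown. A minimal sketch of one way to build such a table follows, assuming all it needs is an integer id column running from 1 to 1,000,000 (which is how the script uses it). The helper table digits is a hypothetical name introduced only for this example.

```sql
-- Hypothetical helper table holding the digits 0..9.
CREATE TABLE digits (
  n int NOT NULL,
  PRIMARY KEY (n)
);

INSERT INTO digits VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

-- Assumed shape of t_source: a single integer id column.
CREATE TABLE t_source (
  id int NOT NULL,
  PRIMARY KEY (id)
);

-- Cross-joining six copies of digits yields 10^6 combinations,
-- each mapped to a distinct id from 1 to 1000000.
INSERT
INTO    t_source (id)
SELECT  1 + d1.n + 10 * d2.n + 100 * d3.n
          + 1000 * d4.n + 10000 * d5.n + 100000 * d6.n
FROM    digits d1
CROSS JOIN digits d2
CROSS JOIN digits d3
CROSS JOIN digits d4
CROSS JOIN digits d5
CROSS JOIN digits d6;
```

The cross join is used here because it also works on older MySQL versions that have no recursive common table expressions.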
