Mysql慢查询:JOIN +多个WHERES + ORDER BY [英] Mysql slow query: JOIN + multiple WHERES + ORDER BY

查看:207
本文介绍了Mysql慢查询:JOIN +多个WHERES + ORDER BY的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

长时间潜伏,第一个问题!

long time lurker, first question!

我在努力优化此查询,选择与所选过滤器匹配的价格最低的项目:

I am struggling to optimize this query, which selects the lowest priced items that match the chosen filters:

SELECT product_info.*, MIN(product_all.sale_price) as sale_price, product_all.buy_link
FROM product_info
NATURAL JOIN (SELECT * FROM product_all WHERE product_all.date = '2010-09-30') as product_all
WHERE (product_info.category = 2  
AND product_info.gender = 'W' )
GROUP BY product_all.prod_id
ORDER BY MIN(product_all.sale_price) ASC LIMIT 13

p>

Its explain:

| id | select_type | table        | type   | possible_keys                                             | key     | key_len | ref                 | rows   | Extra                           |  
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+  
|  1 | PRIMARY     | <derived2>   | ALL    | NULL                                                     | NULL    | NULL    | NULL                | 89801  | Using temporary; Using filesort | 
|  1 | PRIMARY     | product_info | eq_ref | PRIMARY,category_prod_id_retail_price,category_ret...     | PRIMARY | 4       | product_all.prod_id | 1      | Using where                     | 
|  2 | DERIVED     | product_all  | ref    | date_2                                                    | date_2  | 3       |                     | 144107 |                                 | 

我已经尝试删除子查询,这看起来更好,但在实践中需要更长时间: p>

I've tried eliminating the subquery, which intuitively seems better but in practice takes even longer:

SELECT product_info.*, MIN(product_all.sale_price) as sale_price, product_all.buy_link
FROM product_info
NATURAL JOIN product_all
WHERE (product_all.date = '2010-09-30'
AND product_info.category = 2 
AND product_info.gender = 'W' )
GROUP BY product_all.prod_id
ORDER BY MIN(product_all.sale_price) ASC LIMIT 13

/ p>

And its explain:

| id | select_type | table        | type | possible_keys                                             | key                      | key_len | ref                               | rows | Extra                                        |  
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+  
|  1 | SIMPLE      | product_info | ref  | PRIMARY,category_prod_id_retail_price,category_ret...     | category_retail_price    | 5       | const                             | 269  | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | product_all  | ref  | PRIMARY,prod_id,date_2                                    | prod_id                  | 4       | equipster_db.product_info.prod_id | 141  | Using where                                  | 

以下是表格:

CREATE TABLE `product_all` (
`prod_id` INT( 10 ) NOT NULL PRIMARY KEY ,
`ref_id` INT( 10) NOT NULL PRIMARY KEY ,
`date` DATE NOT NULL ,
`buy_link` BLOB NOT NULL ,
`sale_price` FLOAT NOT NULL
) ENGINE = MYISAM ;


CREATE TABLE `product_info` (
`prod_id` INT( 10 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`prod_name` VARCHAR( 200 ) NOT NULL,
`brand` VARCHAR( 50 ) NOT NULL,
`retail_price` FLOAT NOT NULL
`category` INT( 3 ) NOT NULL,
`gender` VARCHAR( 1 ) NOT NULL,
`type` VARCHAR( 10 ) NOT NULL
) ENGINE = MYISAM ;

我的问题:

-

-what索引将优化此查询?

-less重要的是:当添加或删除WHERE子句或使用不同的ORDER BY时,索引方法如何更改,例如按%off进行排序:

My Questions:
-which query structure seems optimal?
-what indices would optimize this query?
-less importantly: how does the indexing approach change when adding or removing WHERE clauses or using a different ORDER BY, such as sorting by % off:

ORDER BY (1-(MIN(product_all.sale_price)/product_info.retail_price)) DESC  

编辑:这两个查询的自然连接操作都是prod_id(product_info中的一个记录在product_all中有多个实例,这就是为什么它们需要分组)

edit: both queries' natural join acts on prod_id (one record in product_info can have multiple instances in product_all, which is why they need to be grouped)

推荐答案

索引在mysql中有很大的区别,花了15分钟与一组错误的索引花了.2秒与正确的,但它找到正确的平衡,这是一般的问题。自然没有一些样本数据,它真的很难说如果下面的解决方案将节省你任何时间,但在理论上应该。

Indices make a massive difference in mysql, one query that took 15 minutes with a wrong set of indices took .2 seconds with the right ones, but its finding the right balance that is generally the issue. Naturally without some sample data its really hard to say if the below solution will save you any time, but in theory it should.

要回答您的问题,我将重新设计表格如下:

To answer your questions, I would redesign the tables like so:

CREATE TABLE `product_all` ( 
`prod_id` INT( 10 ) NOT NULL, 
`ref_id` INT( 10) NOT NULL, 
`date` DATE NOT NULL , 
`buy_link` BLOB NOT NULL , 
`sale_price` FLOAT NOT NULL,
PRIMARY KEY (prod_id, ref_id) ,
INDEX date_Index (`date` ASC),
UNIQUE INDEX prod_price_Index (prod_id ASC, sale_price ASC)
) ENGINE = MYISAM ; 


CREATE TABLE `product_info` ( 
`prod_id` INT( 10 ) NOT NULL AUTO_INCREMENT, 
`prod_name` VARCHAR( 200 ) NOT NULL, 
`brand` VARCHAR( 50 ) NOT NULL, 
`retail_price` FLOAT NOT NULL, 
`category` INT( 3 ) NOT NULL, 
`gender` VARCHAR( 1 ) NOT NULL, 
`type` VARCHAR( 10 ) NOT NULL,
PRIMARY KEY (prod_id) ,
UNIQUE INDEX prod_id_name_Index (prod_id ASC, prod_name ASC),
INDEX category_Index (category ASC),
INDEX gender_Index (gender ASC)
) ENGINE = MYISAM ;

SELECT product_info.*, MIN(product_all.sale_price) as sale_price, product_all.buy_link         
FROM product_info         
NATURAL JOIN (SELECT * FROM product_all WHERE product_all.date = '2010-09-30') as product_all         
WHERE (product_info.category = 2           
AND product_info.gender = 'W' )         
GROUP BY product_all.prod_id         
ORDER BY MIN(product_all.sale_price) ASC LIMIT 13        

这里的性能提升是通过索引正在加入的主要字段并在where子句中列出。个人而言,我会用你的第一个查询,当你认为它应该执行更好。

The performance gain here is gained my indexing the main fields that are being joined upon and are featured in the where clause. Personally I would go with your first query as when you think about it that should perform better.

据我了解第一次和第二次查询中发生了什么:

As far as I understand whats happening in the first and second query:


  • 在执行
    自然联接之前,第一个查询被
    a子查询过滤,这意味着它只有
    加入结果数据,而不是
    整个表。 / li>
  • 第二个查询是加入
    整个第二个表,然后

    整行的结果行过滤回您想要的行。

一般来说,您希望在主要连接字段以及在where子句中使用最多的字段中添加索引。我还为您要定期查询的某些字段(例如prod_id_name_Index)添加了一些唯一索引。

As a rule of thumb normally you want to add indices on your major joining fields and also the fields that you use the most in where clauses. I've also put some unique indices on some of the fields that you will want to query regularly, such as prod_id_name_Index.

如果这不会提高您的性能,如果您可能会发布一些虚拟数据来玩,我可能能够得到一个更快的解决方案,我可以基准。

If this doesn't improve your performance if you could maybe post some dummy data to play with I might be able to get a faster solution that I can benchmark.

这里是一篇文章,通过索引在mysql的性能,值得一读,如果你想知道更多。

Here is an article that goes through indexing for performance in mysql, worth a read if you want to know more.

祝你好运!

编辑:你最后一个问题我错过了第一次,答案是如果你索引的主要连接字段对where的更改只会对整体性能产生轻微的影响,但是我对表进行的唯一索引应该考虑到大多数需要查询的事情。要记住的主要事情是,如果你频繁地查询或加入一个字段,那么它应该真的被索引,但是对你的顺序的轻微的查询和更改应该不会担心重新调整你的索引策略。

Your final question I missed the first time, the answer is that if your indexing the main joining fields then changes to the where will only impact the overall performance slightly, but the unique indices I've put on the tables should account for the majority of things you'll want to base queries upon. The main thing to remember is if you query or join upon a field frequently then it should really be indexed, but minor queries and changes to the order by you should just not worry about in terms of realigning your indexing strategy.

这篇关于Mysql慢查询:JOIN +多个WHERES + ORDER BY的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆