mysql: choosing the most efficient query from the two

Problem description

Both of these MySQL queries produce exactly the same result. Query A is a simple union with the postType condition embedded in each individual query, whereas query B applies the same condition in the outer select over a derived table (sigma) that is the union of the individual query results. I am concerned that sigma in query B might grow too large for no good reason if there are a lot of rows. But then I am a bit confused about how ORDER BY works for query A: would it not also have to build a temporary table or something similar to sort the results? It may all depend on how ORDER BY works for a union. If ORDER BY on a union also creates a temporary table, would query A then be roughly equivalent to query B in resource usage? (Query B would be much easier for us to implement in our system than query A.) Please guide or advise in any way possible, thanks.

Query A

SELECT `t1`.*, `t2`.*
FROM `t1` INNER JOIN `t2` ON
    `t1`.websiteID = `t2`.ownerID
    AND `t1`.authorID = `t2`.authorID
    AND `t1`.authorID = 1559 AND `t1`.postType = "simplePost"
UNION
SELECT `t1`.*
FROM `t1` WHERE websiteID = 1559 AND postType = "simplePost"
ORDER BY postID LIMIT 0,50

Query B

SELECT * FROM (
    SELECT `t1`.*, `t2`.*
    FROM `t1` INNER JOIN `t2` ON
        `t1`.websiteID = `t2`.ownerID
        AND `t1`.authorID = `t2`.authorID
        AND `t1`.authorID = 1559
    UNION
    SELECT `t1`.*
    FROM `t1` WHERE websiteID = 1559
) AS sigma
WHERE postType = "simplePost" ORDER BY postID LIMIT 0,50

EXPLAIN FOR QUERY A

id    select_type     table       type    possible_keys   key         key_len ref     rows    Extra
1     PRIMARY         t2          ref     userID          userID      4       const   1
1     PRIMARY         t1          ref     authorID        authorID    4       const   2       Using where
2     UNION           t1          ref     websiteID       websiteID   4       const   9       Using where
NULL  UNION RESULT    <union1,2>  ALL     NULL            NULL        NULL    NULL    NULL    Using filesort

EXPLAIN FOR QUERY B

id    select_type     table       type    possible_keys   key         key_len ref     rows    Extra
1     PRIMARY         <derived2>  ALL     NULL            NULL        NULL    NULL    10      Using where; Using filesort
2     DERIVED         t2          ref     userID          userID      4               1
2     DERIVED         t1          ref     authorID        authorID    4               2       Using where
3     UNION           t1          ref     websiteID       websiteID   4               9
NULL  UNION RESULT    <union2,3>  ALL     NULL            NULL        NULL    NULL    NULL

Solution

There is no doubt that version 1 (query A, with separate where clauses on each side of the union) will be faster. Let's look at why version 2 (query B, with the where clause applied over the union result) is worse:

  • data volume: the union result will contain at least as many rows, and usually more, because there are fewer conditions on which rows are returned. This means more disk I/O (depending on indexes) and more temporary storage to hold the rowset, which means more processing time
  • repeated scan: the entire result of the union must be scanned again to apply the condition, when it could have been handled during the initial scan. This means handling the rowset twice; that probably happens in memory, but it is still extra work.
  • indexes: indexes aren't used for where clauses on a union result. If you have an index over the foreign key fields and postType, it would not be used for the outer condition

If you want maximum performance, use UNION ALL instead of UNION. UNION ALL passes the rows straight into the result without the deduplication overhead, whereas UNION removes duplicates (usually by sorting), which can be expensive and, based on your comments, is unnecessary here.
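
As a rough sketch of that advice, query A with UNION ALL might look like the following. It makes two assumptions that the question does not confirm: that both branches can return just the t1 columns (so the column lists line up), and that duplicate rows either cannot occur or are acceptable.

```sql
-- Sketch only: query A with UNION ALL instead of UNION.
-- Assumption: both branches select only the t1 columns, so the column counts match.
-- Assumption: duplicates across the two branches are impossible or acceptable.
(SELECT t1.*
   FROM t1
  INNER JOIN t2
     ON t1.websiteID = t2.ownerID
    AND t1.authorID  = t2.authorID
  WHERE t1.authorID = 1559
    AND t1.postType = 'simplePost')
UNION ALL
(SELECT t1.*
   FROM t1
  WHERE t1.websiteID = 1559
    AND t1.postType  = 'simplePost')
ORDER BY postID
LIMIT 0, 50;
```

Note that a row matching both branches would now appear twice, and the final ORDER BY still has to sort the combined rows, as the Using filesort entry in the EXPLAIN output for query A suggests.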

Define these indexes and use version 1 for maximum performance:

create index t1_authorID_postType on t1(authorID, postType);
create index t1_websiteID_postType on t1(websiteID, postType);
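
One way to check whether the new indexes are picked up is to run EXPLAIN on a single branch of query A after creating them. This is a minimal sketch; the optimizer may still prefer the existing websiteID index depending on table statistics.

```sql
-- Sketch only: inspect the plan for the second branch of query A.
EXPLAIN
SELECT t1.*
  FROM t1
 WHERE t1.websiteID = 1559
   AND t1.postType  = 'simplePost';
-- If the composite index is chosen, the key column should show t1_websiteID_postType.
```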
