MySQL Query Optimisation - JOIN?

Question

One for all you MySQL experts :-)

I have the following query:

SELECT o.*, p.name, p.amount, p.quantity 
FROM orders o, products p 
WHERE o.id = p.order_id AND o.total != '0.00' AND DATE(o.timestamp) BETWEEN '2012-01-01' AND '2012-01-31' 
ORDER BY o.timestamp ASC

  • orders table = 80,900 rows
  • products table = 125,389 rows
  • o.id and p.order_id are indexed
    The query takes about 6 seconds to complete - which is way too long. I am looking for a way to optimize it, possibly with temporary tables or a different type of join. I'm afraid my understanding of both of these concepts is pretty limited.

    Can anyone suggest a way for me to optimize this query?

    Recommended answer

    First, I would use a different style of syntax. ANSI-92 has had 20 years to bed in, and many RDBMS actually recommend not using the notation you have used. It's not going to make a difference in this case, but it really is very good practice for a host of reasons (that I'll let you investigate and make a decision on yourself).

    Final answer, and example syntax:

    SELECT
      orders.*, products.name, products.amount, products.quantity
    FROM
      orders
    INNER JOIN
      products
        ON orders.id = products.order_id 
    WHERE
          orders.timestamp >= '2012-01-01'
      AND orders.timestamp <  '2012-02-01'
      AND orders.total     != '0.00' 
    ORDER BY
      orders.timestamp ASC
    

    As the orders table is the one you are making the initial filtering on, that's a very good place to start looking at optimisation.


    With DATE(o.timestamp) BETWEEN x AND y you succeed in getting all dates and times in January. But that requires calling the DATE() function on every single row in the orders table (row by agonizing row, which is what RBAR stands for). The RDBMS can't see through the function to work out how to avoid that wasted effort. Instead, we need to do that optimisation ourselves, by re-arranging the maths so that no function is applied to the field we are filtering on.

        orders.timestamp >= '2012-01-01'
    AND orders.timestamp <  '2012-02-01'
    

    This version allows the optimiser to know that you want a block of dates that are all sequential with each other. It's called a range-seek. It can use an index to very quickly find the first record and last record that fit that range, then pick out every record in between. That avoids checking all the records that don't fit, and even avoids checking all the records in the middle of the range; only the boundaries need to be sought out.
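One way to see the difference is to run both predicates through EXPLAIN. This is only a sketch: it assumes an index whose leading column is timestamp already exists on orders, and the exact plan depends on your MySQL version and data.

```sql
-- Function wrapped around the column: MySQL has to evaluate DATE() per row,
-- so the `type` column of the plan is typically ALL (table scan) or index (full index scan).
EXPLAIN
SELECT id
FROM orders
WHERE DATE(timestamp) BETWEEN '2012-01-01' AND '2012-01-31';

-- Half-open range on the bare column: eligible for a range seek,
-- which typically shows up as type = range.
EXPLAIN
SELECT id
FROM orders
WHERE timestamp >= '2012-01-01'
  AND timestamp <  '2012-02-01';
```

The half-open form (>= the first day, < the first day of the next month) also picks up rows with a time component on 31 January without having to write '2012-01-31 23:59:59'.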

    That assumes all the records are ordered by date, and that the optimiser can see that. To do so you need an index. With that in mind there seem to be two basic composite indexes that you could use:
    - (id, timestamp)
    - (timestamp, id)

    The first is what I see people use the most. But that forces the optimiser to do the timestamp range-seek for each id separately. And since every id likely has a different timestamp value, you've gained nothing.

    The second index is what I recommend.
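As a minimal sketch, the recommended index could be created like this. The index name idx_orders_timestamp_id is made up for illustration; use whatever naming convention your schema follows.

```sql
-- Composite index with timestamp as the leading column, so the date range
-- can be located with a seek and read back in timestamp order.
-- (idx_orders_timestamp_id is a hypothetical name.)
CREATE INDEX idx_orders_timestamp_id
    ON orders (timestamp, id);
```

If orders is an InnoDB table and id is its primary key, every secondary index already carries the primary key columns, so an index on (timestamp) alone should behave much the same. That is an assumption about your schema, not something stated in the question.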

    Now, the optimiser can fulfil this part of your query exceptionally quickly...

    SELECT
      orders.*
    FROM
      orders
    WHERE
          orders.timestamp >= '2012-01-01'
      AND orders.timestamp <  '2012-02-01'
    ORDER BY
      orders.timestamp ASC
    

    As it happens, even the ORDER BY has been optimised with the suggested index. It's already in the order that you want the data to be output. There is no need to re-sort everything after the join.


    Then, to fulfil the total != '0.00' requirement, every row in your range is still checked. But you've already narrowed the range down so much that this will probably be fine. (I won't go into it, but you will likely find it impossible to use indexes in MySQL to optimise both this and the timestamp range-seek.)

    Then, you have your join. That's optimised by an index you already have (products.order_id). For every record picked out by the snippet above, the optimiser can do an index seek and very quickly identify the matching record(s).


    This all assumes that, in the vast majority of cases, every order row has one or more product rows. If, for example, only a very select few orders had any product rows, it may be faster to pick out the product rows of interest first; essentially looking at the joins happening in reverse order.

    The optimiser actually makes that decision for you, but it's handy to know that it's doing that, then provide the indexes you estimate will be most useful to it.

    You can check the explain plan to see if the indexes are being used. If not, your attempt to help was ignored, probably because the statistics of the data implied that a different join order was better. If so, you can then provide indexes that help that join order instead.
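For example, prefixing the final query with EXPLAIN shows one row per table in the order MySQL intends to read them. This is a sketch of what to run; the output depends on your data and server version, so no expected plan is shown here.

```sql
EXPLAIN
SELECT
  orders.*, products.name, products.amount, products.quantity
FROM
  orders
INNER JOIN
  products
    ON orders.id = products.order_id
WHERE
      orders.timestamp >= '2012-01-01'
  AND orders.timestamp <  '2012-02-01'
  AND orders.total     != '0.00'
ORDER BY
  orders.timestamp ASC;
```

In the output, the `key` column names the index chosen for each table (NULL means none was used), `type` shows the access method (range for a range seek, ref for an index lookup on the join, ALL for a full table scan), and "Using filesort" in `Extra` means the rows are being sorted after retrieval rather than read in index order.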
