Can you index subqueries?

Question

I have a table and a query that looks like the one below. For a working example, see this SQL Fiddle.

SELECT o.property_B, SUM(o.score1), w.score
FROM o
INNER JOIN 
(
    SELECT o.property_B, SUM(o.score2) AS score FROM o GROUP BY property_B
) w ON w.property_B = o.property_B
WHERE o.property_A = 'specific_A'
GROUP BY property_B;

With my real data, this query takes 27 seconds. However, if I first create w as a temporary table and index property_B, it takes about 1 second altogether.

CREATE TEMPORARY TABLE w AS
SELECT o.property_B, SUM(o.score2) AS score FROM o GROUP BY property_B;

ALTER TABLE w ADD INDEX `property_B_idx` (property_B);

SELECT o.property_B, SUM(o.score1), w.score
FROM o
INNER JOIN w ON w.property_B = o.property_B
WHERE o.property_A = 'specific_A'
GROUP BY property_B;

DROP TABLE IF EXISTS w;

Is there a way to combine the best of these two queries? I.e. a single query with the speed advantages of the indexing in the subquery?

After Mehran's answer below, I read this piece of explanation in the MySQL documentation:

As of MySQL 5.6.3, the optimizer more efficiently handles subqueries in the FROM clause (that is, derived tables):

...

For cases when materialization is required for a subquery in the FROM clause, the optimizer may speed up access to the result by adding an index to the materialized table. If such an index would permit ref access to the table, it can greatly reduce amount of data that must be read during query execution. Consider the following query:

SELECT * FROM t1
  JOIN (SELECT * FROM t2) AS derived_t2 ON t1.f1=derived_t2.f1;

The optimizer constructs an index over column f1 from derived_t2 if doing so would permit the use of ref access for the lowest cost execution plan. After adding the index, the optimizer can treat the materialized derived table the same as a usual table with an index, and it benefits similarly from the generated index. The overhead of index creation is negligible compared to the cost of query execution without the index. If ref access would result in higher cost than some other access method, no index is created and the optimizer loses nothing.

Recommended answer

First of all, creating a temporary table is a perfectly feasible solution, but only when no other option applies, which is not the case here.

In your case, you can easily speed up the query, as FrankPl pointed out, because your subquery and main query both group by the same field, so you don't need a subquery at all. I'm copying FrankPl's solution here for the sake of completeness:

SELECT o.property_B, SUM(o.score1), SUM(o.score2)
FROM o
GROUP BY property_B;

That does not mean you will never meet a case where you wish you could index a subquery. In such cases you have two choices. The first is to use a temporary table holding the subquery's results, as you pointed out yourself. Its advantage is that MySQL has supported it for a long time; its drawback is that it becomes impractical when a huge amount of data is involved.
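If the temporary-table route is taken, the table and its index can be created in a single statement, since MySQL's CREATE TABLE ... SELECT accepts index definitions ahead of the SELECT. The following is only a sketch using the table, column, and index names from the question; w.score is added to the GROUP BY on the assumption that the server may run with ONLY_FULL_GROUP_BY enabled.

```sql
-- Build the aggregate and its index in one statement.
CREATE TEMPORARY TABLE w (INDEX property_B_idx (property_B)) AS
SELECT o.property_B, SUM(o.score2) AS score
FROM o
GROUP BY o.property_B;

-- Join against the indexed temporary table.
-- w.score is listed in GROUP BY so the query is also valid
-- under ONLY_FULL_GROUP_BY (an assumption about the server's sql_mode).
SELECT o.property_B, SUM(o.score1), w.score
FROM o
INNER JOIN w ON w.property_B = o.property_B
WHERE o.property_A = 'specific_A'
GROUP BY o.property_B, w.score;

DROP TEMPORARY TABLE IF EXISTS w;
```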

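The second solution is to use MySQL 5.6 or above. Recent versions incorporate new optimizer algorithms: as the documentation excerpt quoted in the question describes, the optimizer can add an index to a materialized derived table when that permits ref access, so joining against the subquery's result no longer requires scanning all of it.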
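One way to check whether the optimizer actually does this for a given query is to run EXPLAIN on it. This is a sketch, assuming MySQL 5.6.3 or later and the table from the question. When an index is generated for the materialized derived table, the key column of the derived table's row in the EXPLAIN output typically shows a generated name such as <auto_key0>, but the plan chosen depends on the data and the server version.

```sql
-- Inspect the plan for the original single-statement query.
-- Look at the row for the derived table: a generated key there
-- suggests the optimizer indexed the materialized subquery.
EXPLAIN
SELECT o.property_B, SUM(o.score1), w.score
FROM o
INNER JOIN
(
    SELECT property_B, SUM(score2) AS score FROM o GROUP BY property_B
) w ON w.property_B = o.property_B
WHERE o.property_A = 'specific_A'
GROUP BY o.property_B, w.score;
```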

[UPDATE]

For the edited version of the question I would recommend the following solution:

SELECT o.property_B, SUM(IF(o.property_A = 'specific_A', o.score1, 0)), SUM(o.score2)
FROM o
GROUP BY property_B
HAVING SUM(IF(o.property_A = 'specific_A', o.score1, 0)) > 0;

But you need to work on the HAVING part; you might need to change it to suit your actual problem.
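One possible adjustment, offered as a sketch rather than a definitive fix: the HAVING clause above drops a group whenever its matching score1 values sum to zero or less, whereas the original join would keep any property_B that has at least one row with property_A = 'specific_A'. Counting the matching rows instead mirrors that filter. The column aliases score1_sum and score2_sum are illustrative names, not part of the original query.

```sql
SELECT o.property_B,
       -- Sum score1 only over rows that match the filter.
       SUM(CASE WHEN o.property_A = 'specific_A' THEN o.score1 ELSE 0 END) AS score1_sum,
       -- Sum score2 over every row of the group, as the subquery did.
       SUM(o.score2) AS score2_sum
FROM o
GROUP BY o.property_B
-- Keep a group if it has at least one matching row,
-- regardless of what its score1 values add up to.
HAVING COUNT(CASE WHEN o.property_A = 'specific_A' THEN 1 END) > 0;
```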
