优化查询以返回大量记录,这是避免数百个联接的一种方式.这是一个聪明的解决方案吗? [英] Optimizing a query returning a lot of records, a way to avoid hundreds of join. Is it a smart solution?

查看:129
本文介绍了优化查询以返回大量记录,这是避免数百个联接的一种方式.这是一个聪明的解决方案吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我不是很聪明的SQL,并且对如何优化查询存在以下疑问.我正在使用 MySql

I am not so int SQL and I have the following doubt about how to optimize a query. I am using MySql

我有这个数据库模式:

这是将特定商品进入特定市场的最后价格( Market_Commodity_Price_Series 表中的最后日期)的查询.

And this is the query that returns the last price (the last date into the Market_Commodity_Price_Series table) of a specific commodity into a specific market.

它包含许多联接以检索所有相关信息:

It contains a lot of join to retrieve all the related information:

SELECT MCPS.id AS series_id,
        MD_CD.market_details_id AS market_id,
        MD_CD.commodity_details_id AS commodity_id,
        MD.market_name AS market_name,
        MCPS.price_date AS price_date,
        MCPS.avg_price AS avg_price,
        CU.ISO_4217_cod AS currency, 
        MU.unit_name AS measure_unit, 
        CD.commodity_name_en,
        CN.commodity_name 
FROM Market_Commodity_Price_Series AS MCPS
INNER JOIN MeasureUnit AS MU ON MCPS.measure_unit_id = MU.id
INNER JOIN Currency AS CU ON MCPS.currency_id = CU.id
INNER JOIN MarketDetails_CommodityDetails AS MD_CD ON MCPS.market_commodity_details_id = MD_CD.id
INNER JOIN MarketDetails AS MD ON MD_CD.market_details_id = MD.id
INNER JOIN CommodityDetails AS CD ON MD_CD.commodity_details_id = CD.id
INNER JOIN CommodityName AS CN ON CD.id = CN.commodity_details_id
INNER JOIN Languages AS LN ON CN.language_id  = LN.id
WHERE MD.id = 4
AND CD.id = 4 
AND LN.id=1
ORDER BY price_date DESC LIMIT 1

我的疑问是:使用上一个查询,我将从 Market_Commodity_Price_Series 表中提取与特定商品有关的所有记录到特定市场,进行大量的联接,并根据 price_date 字段,并限制为最后一个.

My doubt is: using the previous query I am extracting all the records related to a specific commodity into a specific market from the Market_Commodity_Price_Series table, do a lot of join, ordinating these records based on the price_date field and limiting to the last one.

我认为这可能会花费很多,因为我可以拥有很多记录(因为 Market_Commodity_Price_Series 表包含每日信息).

I think that it could be expansive because I can have a lot of records (because the Market_Commodity_Price_Series table contains daily information).

此查询有效,但我认为可以通过更智能的方式完成.

This query works but I think that can be done in a smarter way.

所以我认为我可以做这样的事情:

So I thought that I can do something like this:

1)使用类似这样的查询,选择与特定商品最后进入特定市场的价格相关的记录:

1) Select the record related to the last price of a specific commodity into a specific market using a query like this:

SELECT measure_unit_id, 
        currency_id, 
        market_commodity_details_id, 
        MAX(price_date) price_date
FROM Market_Commodity_Price_Series  AS MCPS 
INNER JOIN MarketDetails_CommodityDetails AS MD_CD ON MCPS.market_commodity_details_id = MD_CD.id
WHERE MD_CD.market_details_id = 4
AND MD_CD.commodity_details_id = 4
GROUP BY measure_unit_id, currency_id, market_commodity_details_id

返回与此信息相关的单个记录:

that returns the single record related to this information:

measure_unit_id      currency_id          market_commodity_details_id price_date
--------------------------------------------------------------------------------
1                    2                    24                          05/10/2017

将此输出用作表格(我不知道确切的名称,也许是视图?)并将此表格"与 MeasureUnit,Currency,MarketDetails, CommodityDetails,CommodityName和Languages 表.

Use this output like a table (I don't know the exact name, maybe view, is it?) and join this "table" to the other required information that are into the MeasureUnit, Currency, MarketDetails, CommodityDetails, CommodityName and Languages tables.

我认为可能会更好,因为这样我可以使用 MAX(price_date)price_date 将与最新价格相关的记录仅提取到 Market_Commodity_Price_Series 而是获取所有记录,排序并限制为最新记录.

I think that it could be better because in this way I am using the MAX(price_date) price_date to extract only the record related to the latest price into the Market_Commodity_Price_Series instead obtain all the records, ordering and limiting to the latest one.

JOIN 操作中的大多数操作都在执行上一个查询返回的单个记录,而不是在我的第一个查询版本返回的所有记录上(可能是成百上千个)

Furthermore most onf the JOIN operation are doing o the single record returned by the previous query and not on all the records returned by the first version of my query (potentially they could be hundreds or thousands).

可能是一个聪明的解决方案吗?

Could be a smart solution?

如果是的话...将该查询的输出(将其视为表)与其他表联接的正确语法是什么?

If yes...what is the correct syntax to join the output of this query (considering it as a table) with the other tables?

推荐答案

您在编写高效查询方面做得相当不错.

You've done a reasonably good job of writing an efficient query.

您没有使用SELECT *,因为它会产生肿且多余的中间结果集,因此会破坏具有大量联接的查询的性能.但是您的中间结果集(您对ORDER BY应用的结果集)并没有肿.

You didn't use SELECT *, which can mess up performance in a query with lots of joins, because it generates bloated and redundant intermediate result sets. But your intermediate result set -- the one you apply ORDER BY to -- is not bloated.

您的WHERE col = val子句主要提到表的主键(我想).那挺好的.

Your WHERE col = val clauses mostly mention primary keys of tables (I guess). That's good.

您的大表Market_Commodity_Price_Series可能使用化合物覆盖指数.同样,其他一些表可能需要这种索引.但这应该是另一个问题的话题.

Your big table Market_Commodity_Price_Series could maybe use a compound covering index. Similarly, some other tables may need that kind of index. But that should be the topic of another question.

您建议的优化-排序一个主要由id值组成的中间结果集-如果您正在执行ORDER BY ... LIMIT并使用LIMIT函数来丢弃大部分结果,则将大有帮助.但是您没有这样做.

Your proposed optimization -- ordering an intermediate result set consisting mostly of id values -- would help a lot if you were doing ORDER BY ... LIMIT and using the LIMIT function to discard most of your results. But you are not doing that.

在不了解您的数据的情况下,很难提供清晰的意见.但是,如果是我,我将使用您的第一个查询.在您投入生产时(以及其他复杂的查询中),我会密切注意.当(如果不是)性能开始下降时,您可以执行EXPLAIN并找出索引表的最佳方法.您已经编写了一个可以使您的应用程序启动并运行的查询,已经做得很好.随它去吧!

Without knowing more about your data, it's hard to offer a crisp opinion. But, if it were me I'd use your first query. I'd keep an eye on it as you go into production (and on other complex queries). When (not if) performance starts to deteriorate, then you can do EXPLAIN and figure out the best way to index your tables. You've done a good job of writing a query that will get your application up and running. Go with it!

这篇关于优化查询以返回大量记录,这是避免数百个联接的一种方式.这是一个聪明的解决方案吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆