Partial results from a long-running SELECT query?

Question

We are running some long-running queries against a MySQL database. (The context is offline data analysis, not an application.) How we proceed with the research depends on the results we obtain along the way, so it would be useful to view partial results as a SELECT statement generates them, before the query completes.

Is this possible? Or are we stuck waiting until the query completes (which, given the size of the dataset, can take a couple of hours) to view results that were generated in the first seconds it ran?

Thanks for your help.

Recommended answer

The simplest thing to try is an unbuffered query. MySQL will then start delivering rows as soon as it can, rather than waiting until the whole result set is ready (and buffered). Depending on your query, this may not help.
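As a rough illustration of consuming rows as they arrive, here is a minimal Python sketch. It is not from the original answer: the helper names `stream_rows` and `demo` are hypothetical, and an in-memory SQLite database stands in for MySQL so the code runs anywhere. Whether rows actually reach the client before the query finishes depends on the driver using an unbuffered (server-side) cursor, for example PyMySQL's `SSCursor`.

```python
import sqlite3
from typing import Any, Iterator, Tuple


def stream_rows(cursor: Any, sql: str, batch_size: int = 1000) -> Iterator[Tuple]:
    """Yield rows batch by batch instead of loading the full result set."""
    cursor.execute(sql)
    while True:
        batch = cursor.fetchmany(batch_size)  # standard DB-API call
        if not batch:
            break
        for row in batch:
            yield row


def demo() -> list:
    # Assumption: SQLite is only a stand-in. With MySQL you would pass an
    # unbuffered cursor from your driver (e.g. PyMySQL's SSCursor) instead.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE db (field1 INTEGER PRIMARY KEY, field3 INTEGER)")
    conn.executemany("INSERT INTO db VALUES (?, ?)", [(i, i * 2) for i in range(1, 6)])
    seen = []
    sql = "SELECT field1, field3 FROM db ORDER BY field1"
    for row in stream_rows(conn.cursor(), sql, batch_size=2):
        seen.append(row)  # inspect or print each row as soon as it is fetched
    conn.close()
    return seen
```

Because the loop body runs per row, you can print or analyze early rows while later ones are still being fetched.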

To really speed things up, you need to break up your query. Simply adding LIMIT will not necessarily save much time; it depends on the query. For example, if you have an ORDER BY, pretty much the whole result set has to be computed first, so you only save the time it takes to send less data across the network.

Split up your queries with a filter. If you have an indexed field that you can do range searches on (e.g. an auto-increment column), break your query into multiple queries using that field. For example:

-- BETWEEN is inclusive on both ends, so the ranges must not share a boundary value
SELECT * FROM db WHERE field1 BETWEEN 1 AND 10000;
SELECT * FROM db WHERE field1 BETWEEN 10001 AND 20000;
...
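
The range queries above could be driven from a script so that each chunk's rows are visible as soon as that chunk finishes. The following is only a sketch under assumptions: `id_ranges`, `run_in_chunks`, and `demo_chunks` are hypothetical helpers, and an in-memory SQLite table stands in for the MySQL table `db`.

```python
import sqlite3
from typing import Iterator, List, Tuple


def id_ranges(first_id: int, last_id: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive, non-overlapping (low, high) bounds covering first_id..last_id."""
    low = first_id
    while low <= last_id:
        high = min(low + chunk_size - 1, last_id)
        yield low, high
        low = high + 1


def run_in_chunks(conn, first_id: int, last_id: int, chunk_size: int) -> List[Tuple]:
    """Run one range query per chunk and report progress after each."""
    rows: List[Tuple] = []
    for low, high in id_ranges(first_id, last_id, chunk_size):
        # sqlite3 uses "?" placeholders; MySQL drivers typically use "%s".
        chunk = conn.execute(
            "SELECT field1, field3 FROM db WHERE field1 BETWEEN ? AND ?",
            (low, high),
        ).fetchall()
        print(f"field1 {low}-{high}: {len(chunk)} rows")  # partial result is available now
        rows.extend(chunk)
    return rows


def demo_chunks() -> List[Tuple]:
    # Assumption: a small SQLite table standing in for the real data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE db (field1 INTEGER PRIMARY KEY, field3 INTEGER)")
    conn.executemany("INSERT INTO db VALUES (?, ?)", [(i, i * 3) for i in range(1, 26)])
    rows = run_in_chunks(conn, 1, 25, 10)
    conn.close()
    return rows
```

Generating the bounds in code avoids off-by-one overlaps between neighboring ranges, since each chunk starts one past the previous chunk's upper bound.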

Then you can combine the results afterward. Multiple queries like this will often complete faster than the equivalent single query. If you have an ORDER BY or GROUP BY, combining the results this way may not be possible. But you can still try breaking the query into smaller queries, joining them with a UNION, and applying your grouping and ordering to the UNION. Believe it or not, this can still be much quicker than the equivalent single query. You just have to get each individual query to process a data set small enough to make it quick.

-- UNION ALL keeps every partial row; plain UNION would add a deduplication pass
SELECT field1, SUM(field3) AS field3, SUM(item_count) AS item_count FROM
(
SELECT field1, SUM(field3) AS field3, COUNT(item) AS item_count FROM db WHERE field1 BETWEEN 1 AND 10000 GROUP BY field1
UNION ALL
SELECT field1, SUM(field3) AS field3, COUNT(item) AS item_count FROM db WHERE field1 BETWEEN 10001 AND 20000 GROUP BY field1
UNION ALL
...
) AS sub_queries GROUP BY field1;
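
If the goal is to watch aggregates build up, the same partial GROUP BY queries can also be run one at a time and merged on the client. This is a sketch rather than part of the original answer: `grouped_in_chunks` and `demo_grouped` are hypothetical names, SQLite stands in for MySQL, and the merge relies on SUM and COUNT being additive across chunks (it would not work unchanged for something like AVG).

```python
import sqlite3
from typing import Dict, Iterable, Tuple


def grouped_in_chunks(conn, ranges: Iterable[Tuple[int, int]]) -> Dict[int, Tuple[int, int]]:
    """Merge per-chunk SUM/COUNT results into running totals keyed by field1."""
    totals: Dict[int, Tuple[int, int]] = {}
    for low, high in ranges:
        cur = conn.execute(
            "SELECT field1, SUM(field3), COUNT(item) FROM db "
            "WHERE field1 BETWEEN ? AND ? GROUP BY field1",
            (low, high),
        )
        for field1, field3_sum, item_count in cur:
            prev_sum, prev_count = totals.get(field1, (0, 0))
            totals[field1] = (prev_sum + field3_sum, prev_count + item_count)
        print(f"after field1 {low}-{high}: {len(totals)} groups so far")
    return totals


def demo_grouped():
    # Assumption: a toy table with 20 distinct field1 values, 5 rows each.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE db (field1 INTEGER, field3 INTEGER, item TEXT)")
    conn.execute("CREATE INDEX idx_field1 ON db (field1)")
    conn.executemany(
        "INSERT INTO db VALUES (?, ?, ?)",
        [(i % 20 + 1, i, f"item{i}") for i in range(100)],
    )
    chunked = grouped_in_chunks(conn, [(1, 10), (11, 20)])
    single = {
        f1: (s, c)
        for f1, s, c in conn.execute(
            "SELECT field1, SUM(field3), COUNT(item) FROM db GROUP BY field1"
        )
    }
    conn.close()
    return chunked, single
```

The demo compares the merged totals with a single GROUP BY over the whole table, which is a cheap sanity check worth running on a small sample before trusting the chunked version on real data.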

Divide and conquer. Using this technique, I have sometimes reduced query times from an hour to a minute or two.
