SQL: Retrieving total sum as subselect very slow


Question

I am trying to fetch some averages and sums over several rows, grouped by each hour of the day. In addition, I want an extra column that holds not the per-hour sum (which the grouping already gives me) but the total sum over all rows up to that specific date. The SQL statement is posted below.

My problem is that running the query on a MySQL database with ~25k rows takes about 8 seconds (CPU i5/8GB RAM). I identified the subselect (... AS 'rain_sum') as the part that makes it so slow. My question: am I overcomplicating this? Is there an easier way to get the same results as the query below?

SELECT
    `timestamp_local` AS `date`,
    AVG(`one`) AS `one_avg`,
    AVG(`two`) AS `two_avg`,
    SUM(`three`) AS `three_sum`,
    (SELECT SUM(`b`.`three`)
        FROM `table` AS `b`
        WHERE `b`.`timestamp_local` <= SUBDATE(`a`.`timestamp_local`, INTERVAL -1 SECOND)
        LIMIT 0,1) AS `rain_sum`
FROM  `table` AS  `a`
GROUP BY
    HOUR( `a`.`timestamp_local` ),
    DAY( `a`.`timestamp_local` ),
    MONTH( `a`.`timestamp_local` ),
    WEEK( `a`.`timestamp_local` ),
    YEAR( `a`.`timestamp_local` )
ORDER BY `a`.`timestamp_local` DESC
LIMIT 0, 24;

Recommended answer

Rather than grouping on all those fields, a simpler (and faster) solution may be to group on a single hour bucket:

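GROUP BY UNIX_TIMESTAMP(timestamp_local) DIV 3600  -- DIV is integer division; plain / returns a fraction, so rows within the same hour would not share a group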
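To see why the bucket has to be a whole number, here is a minimal sketch in Python (not MySQL) that mimics grouping on Unix timestamps. The sample rows and the helper name `hourly_sums` are made up for illustration. In MySQL, `/` returns a fractional value, so an integer form such as `DIV` or `FLOOR(...)` is what puts all rows of one hour into the same group.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_sums(rows):
    """Sum values per hour bucket, where bucket = unix_seconds // 3600."""
    totals = defaultdict(float)
    for ts, value in rows:
        bucket = int(ts.timestamp()) // 3600  # integer division: one bucket per hour
        totals[bucket] += value
    return dict(totals)


# Hypothetical sample data: two readings in the 10:00 hour, one in the 11:00 hour (UTC).
rows = [
    (datetime(2013, 1, 1, 10, 5, tzinfo=timezone.utc), 1.5),
    (datetime(2013, 1, 1, 10, 40, tzinfo=timezone.utc), 2.0),
    (datetime(2013, 1, 1, 11, 15, tzinfo=timezone.utc), 0.5),
]

sums = hourly_sums(rows)

# With fractional division every row gets its own key, so nothing is grouped.
fractional_keys = {ts.timestamp() / 3600 for ts, _ in rows}
```

Here `sums` ends up with two buckets, while `fractional_keys` has one key per row. Note that buckets derived from Unix timestamps are aligned to UTC hours, which may not match local hours in time zones with a non-whole-hour offset.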

I can't imagine that your query returns the results you want (if I understand your requirements correctly). As I understand it, when there are no rows for a given hour, you still want the sum of all rows from earlier hours. MySQL won't return empty groups, though, so an hour without rows produces no output row and the subquery is never evaluated for it.

As far as I know, there is no easy, efficient way to do this in MySQL. I would suggest creating a temporary table containing every possible grouping value in the range you're looking at (probably filled with a loop). You could set this table up in advance to cover a few years and add rows as required. Then you can simply left join this table to your table.
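The calendar-table idea can be sketched as follows. This is only an illustration under several assumptions: it uses SQLite through Python's `sqlite3` module as a stand-in for MySQL, the table and column names `readings`, `hours` and `hour_start` are hypothetical, and the sample rows are invented. The running total is accumulated in application code in a single pass, which is a swapped-in simplification rather than something the answer prescribes, and it assumes the total should include the current hour.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (timestamp_local TEXT, three REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("2013-01-01 10:05:00", 1.5),
        ("2013-01-01 10:40:00", 2.0),
        ("2013-01-01 12:15:00", 0.5),  # nothing was recorded in the 11:00 hour
    ],
)

# Calendar table: one row per hour in the range of interest, filled with a loop.
conn.execute("CREATE TABLE hours (hour_start TEXT PRIMARY KEY)")
start = datetime(2013, 1, 1, 10)
conn.executemany(
    "INSERT INTO hours VALUES (?)",
    [((start + timedelta(hours=i)).strftime("%Y-%m-%d %H:%M:%S"),) for i in range(4)],
)

# LEFT JOIN keeps hours without readings; their per-hour sum becomes 0.
per_hour = conn.execute(
    """
    SELECT h.hour_start, COALESCE(SUM(r.three), 0) AS three_sum
    FROM hours AS h
    LEFT JOIN readings AS r
      ON r.timestamp_local >= h.hour_start
     AND r.timestamp_local < datetime(h.hour_start, '+1 hour')
    GROUP BY h.hour_start
    ORDER BY h.hour_start
    """
).fetchall()

# Running total accumulated in one pass instead of one subselect per output row.
result, running = [], 0.0
for hour_start, three_sum in per_hour:
    running += three_sum
    result.append((hour_start, three_sum, running))
```

In MySQL the upper bound of the join would be written as `h.hour_start + INTERVAL 1 HOUR` instead of SQLite's `datetime(h.hour_start, '+1 hour')`. Because the join compares the raw timestamp column against a range, an index on `timestamp_local` can be used.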

If you were using MSSQL, you could have used a recursive CTE, though this would probably have been very slow. For MySQL alternatives, search for "mysql cte". The way to do this with recursion is to (left) join the table to itself repeatedly on HOUR = HOUR+1 until you get a non-NULL value, then stop. For each of these you would calculate the sum backwards.
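As a related illustration, recursion can also generate the hour series itself, which would replace the loop used to fill a calendar table. This is a different use of a recursive CTE than the repeated self-join described above. The sketch runs on SQLite through Python's `sqlite3` module as a stand-in, and the CTE name `hours`, the column `hour_start` and the date range are hypothetical. MySQL 8.0 and later also accept `WITH RECURSIVE`, although the date arithmetic would be written differently there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each recursive step adds one hour until the upper bound is reached.
hours = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE hours(hour_start) AS (
            SELECT '2013-01-01 10:00:00'
            UNION ALL
            SELECT datetime(hour_start, '+1 hour')
            FROM hours
            WHERE hour_start < '2013-01-01 13:00:00'
        )
        SELECT hour_start FROM hours ORDER BY hour_start
        """
    )
]
```

The resulting list holds one entry per hour from 10:00 to 13:00 inclusive, and the CTE could be left joined to the data table in the same way as a stored calendar table.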
