MySQL Select日期范围介于以NULL分隔的数据系列之间 [英] MySQL Select date ranges between data series separated with NULL

查看:153
本文介绍了MySQL Select日期范围介于以NULL分隔的数据系列之间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是解决更复杂的问题

有一个数据表:

+------------+------+----------+-----------+
| date       | data | data_max | data_diff |
+------------+------+----------+-----------+
| 2017-01-02 |    2 |        2 |      NULL |
| 2017-01-03 |    4 |        4 |      NULL |
| 2017-01-04 |    1 |        4 |        -3 |
| 2017-01-05 |    3 |        4 |        -1 |
| 2017-01-06 |    1 |        4 |        -3 |
| 2017-01-07 |    4 |        4 |      NULL |
| 2017-01-08 |    5 |        5 |      NULL |
| 2017-01-09 |   -2 |        5 |        -7 |
| 2017-01-10 |    0 |        5 |        -5 |
| 2017-01-11 |   -5 |        5 |       -10 |
| 2017-01-12 |    6 |        6 |      NULL |
| 2017-01-13 |    4 |        6 |        -2 |
+------------+------+----------+-----------+

我想计算 data_diff 的最小值和最大值但分别针对每个数据子集。每个数据子集都以NULL开头(但最后一个子集可能不是以NULL结束,而是以数据结尾),我还需要每个数据子集的开始和结束日期,以后可用于计算Min,Max值。我想获取日期范围:

I want to calculate Min and Max values of data_diff but separately for each data subset. Each subset of data starts with NULL (but the last one may not end with NULL but with the data) I need also start and end date of each data subset that I can later use for calculating Min, Max values. I would like to get date ranges:

+----------------+--------------+
| diff_date_from | diff_date_to |
+----------------+--------------+
| 2017-01-04     | 2017-01-06   |
| 2017-01-09     | 2017-01-11   |
| 2017-01-13     | 2017-01-13   |
+----------------+--------------+

如果您想获取示例数据,请执行以下查询:

If you would like to get the example data here's a query:

CREATE TABLE IF NOT EXISTS `test`
(
   `date_time` DATETIME UNIQUE NOT NULL,
   `data` INT NOT NULL
)
ENGINE InnoDB;

INSERT INTO `test` VALUES
('2017-01-02', 2),
('2017-01-03', 4),
('2017-01-04', 1),
('2017-01-05', 3),
('2017-01-06', 1),
('2017-01-07', 4),
('2017-01-08', 5),
('2017-01-09', -2),
('2017-01-10', 0),
('2017-01-11', -5),
('2017-01-12', 6),
('2017-01-13', 4)

;

SELECT 
    DATE(`date_time`) AS `date`,
    `data`,
    `data_max`,
    IF(`data` < `data_max`,  - (`data_max` - `data`), NULL) 
    AS `data_diff`

FROM
    (
    SELECT 
        `date_time`,
        `data`,
        MAX(`data`) OVER (ORDER BY `date_time` ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS `data_max`
    FROM
        `test`
    ) t
;

是否可以编写单个查询来提供上述日期范围?还是必须应用某种程序或某种技巧?

Is it possible to write a single query that will provide date ranges as above? Or there must be a procedure or some sort of trick applied?

也许带有OVER的窗口函数可能会有所帮助,但我不知道如何指定当前窗口之间的窗口边界非NULL的行以及从NULL开头的行开始的先前行。这完全可行吗?

Maybe a window function with OVER could help but I'm not aware how to specify its window boundary between current row that is not NULL and preceding rows starting from a row preceded by NULL. Is this feasible at all?

RANGE 个运算符用于设置窗口边界文档
看起来很有希望:

There is RANGE operator for setting window boundary Documentation that looks promising:

开始:对于ROWS,边界是expr行在当前行之前。
对于RANGE ,边界是值等于当前行值减去expr的行;如果当前行值为NULL,则边界是该行的 peers 的边界。

PRECEDING: For ROWS, the bound is expr rows before the current row. For RANGE, the bound is the rows with values equal to the current row value minus expr; if the current row value is NULL, the bound is the peers of the row.

以及另一部分:


在10个开头和10个跟随之间的X ASC排列顺序

ORDER BY X ASC RANGE BETWEEN 10 PRECEDING AND 10 FOLLOWING

该帧从NULL开始并在NULL处停止,因此仅包含值为NULL的行

The frame starts at NULL and stops at NULL, thus includes only rows with value NULL.

不能理解只包含具有空值的行的意义。
也许在日期范围 2017-01-02 2017-01-03 的范围内,但对于 2017-01-03 2017-01-07 怎么来?

But I don't get the point of inlcuding only rows with null. Perhaps for the date range 2017-01-02 to 2017-01-03 but for 2017-01-03 to 2017-01-07 how come?

推荐答案

我一直在研究您的更复杂的问题(仍在研究答案中),但这是该问题的解决方案。考虑到您正在使用窗口函数,我假设您正在使用MySQL 8,因此也可以使用CTE:

I've been looking at your "more complex problem" (still working on an answer) but here is a solution for this problem. Given that you are using window functions I'm assuming you're using MySQL 8 and so can use CTEs as well:

WITH cte AS (SELECT DATE(`date_time`) AS `date`,
                    `data`,
                    MAX(`data`) OVER (ORDER BY `date_time` ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS `data_max`
             FROM `test`),
cte2 AS (SELECT `date`,
                `data`,
                `data_max`,
                CASE WHEN `data` < `data_max` THEN `data` - `data_max` END AS `data_diff`
         FROM cte)
SELECT `data_max`, 
       MIN(CASE WHEN `data_diff` IS NOT NULL THEN date END) AS diff_date_from,
       MAX(CASE WHEN `data_diff` IS NOT NULL THEN date END) AS diff_date_to
FROM cte2
GROUP BY `data_max`
HAVING diff_date_from IS NOT NULL

输出:

data_max    diff_date_from  diff_date_to
4           2017-01-04      2017-01-06
5           2017-01-09      2017-01-11
6           2017-01-13      2017-01-13

dbfiddle上的演示

这篇关于MySQL Select日期范围介于以NULL分隔的数据系列之间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆