MySQL查询时间段和最大下降值与之前的最大值 [英] MySQL query time periods and value of maximum drop vs preceding max value

查看:71
本文介绍了MySQL查询时间段和最大下降值与之前的最大值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

  1. 如何编写显示最大值下降日期与其前一个最大值的日期的查询(如果有一系列值小于前一个最大值,并且有两个或多个值是最低的那么应该提供第一个最低值出现的日期)
    查询将在实时数据上执行,因此对于特定日期,仅考虑该日期及之前的所有值.

  1. How to write query that shows a date of maximum value drop vs its preceding max value (if there is a series of values that are lower than preceding max value and there are two or more values that are the lowest ones then the date of the first lowest value occurrence should be provided)
    The query will be executed on real time data so for a particular date only values that date and all before are considered.

如何编写一个查询,该查询显示系列结束之间的日期时间段,在该日期时间段内测得的数据低于其之前的最大值?
这等效于最后一个最大数据值与该值的下一个日期之间的日期时间,该日期与上一个最大值相同或更高(以先到者为准).
查询将在历史数据上执行,因此在考虑的行之前和之后的所有行均可用.

How to write a query that shows a date period between the end of series where a measured data was lower than its maximum value that preceded it?
This is an equivalent of a date period between last maximum data value and a following date of the value that is the same amount as previous maximum value or higher (whatever comes first).
The query will be executed on historical data so all rows before and after a considered row are available.

请参阅问题末尾的复制,以生成测试表和示例查询.

Please See Replicate at the end of the question to generate the test table and an example query.

我试图使用窗口函数来实现这些查询,但是我无法构建它们.我只是设法获得了当前测量数据与其之前最接近的最大值之间的差异.

I tried to use window functions to achieve these queries but I could't build them. I only managed to get difference between current data of measurement and its closest max value that preceded it.

测试数据如下:

+---------------------+------+
| date_time           | data |
+---------------------+------+
| 2017-01-02 00:00:00 |    2 |
| 2017-01-03 00:00:00 |    4 |
| 2017-01-04 00:00:00 |    1 |
| 2017-01-05 00:00:00 |    3 |
| 2017-01-06 00:00:00 |    1 |
| 2017-01-07 00:00:00 |    4 |
| 2017-01-08 00:00:00 |    5 |
| 2017-01-09 00:00:00 |   -2 |
| 2017-01-10 00:00:00 |    0 |
| 2017-01-11 00:00:00 |   -5 |
| 2017-01-12 00:00:00 |    6 |
| 2017-01-13 00:00:00 |    4 |
| 2017-01-14 00:00:00 |    6 |
+---------------------+------+

这是当前数据行与我已经拥有的上一个最大数据的差异

and this is the difference of a current data row vs prev max data I already have

+------------+------+----------+-----------+
| date       | data | data_max | data_diff |
+------------+------+----------+-----------+
| 2017-01-02 |    2 |        2 |      NULL |
| 2017-01-03 |    4 |        4 |      NULL |
| 2017-01-04 |    1 |        4 |        -3 |
| 2017-01-05 |    3 |        4 |        -1 |
| 2017-01-06 |    1 |        4 |        -3 |
| 2017-01-07 |    4 |        4 |      NULL |
| 2017-01-08 |    5 |        5 |      NULL |
| 2017-01-09 |   -2 |        5 |        -7 |
| 2017-01-10 |    0 |        5 |        -5 |
| 2017-01-11 |   -5 |        5 |       -10 |
| 2017-01-12 |    6 |        6 |      NULL |
| 2017-01-13 |    4 |        6 |        -2 |
| 2017-01-14 |    6 |        6 |      NULL |
+------------+------+----------+-----------+

这是期望的结果(问题1 ):

+---------------+----------+
| diff_max_date | diff_max |
+---------------+----------+
| 2017-01-04    |       -3 |
| 2017-01-09    |       -7 |
| 2017-01-11    |      -10 |
| 2017-01-13    |       -2 |
+---------------+----------+

请注意,第一项-3是日期2017-01-04的日期,因为这是其前一个最大值之后的第一个最小值:4日期:2017-01-03因此值:-3日期:2017-01-06被忽略.

Please note that first entry -3 is for the date 2017-01-04 because this is the first lowest value after its preceding max value: 4 date: 2017-01-03 therefore value: -3 date: 2017-01-06 is ignored.

问题1的查询适用于插入到测试表中的实时数据,因此它不希望将来有其他条目.这就是为什么应该有两个最低的数据条目值:-7日期:2017-01-09和值:-10日期:2017-01-11的原因,因为在日期2017-01-09处,值-10日期:2017-01-11是未知的.

the query for question 1 works on live data that is inserted to the test table and because of that it is not looking forward for future entries. This is reason why there should be two lowest data entries value: -7 date: 2017-01-09 and value: -10 date: 2017-01-11 because at the date 2017-01-09 a value of -10 date: 2017-01-11 was unknown.

希望的结果(问题2 )

+----------------+--------------+---------------+----------+
| diff_date_from | diff_date_to | diff_max_date | diff_max |
+----------------+--------------+---------------+----------+
| 2017-01-04     | 2017-01-06   | 2017-01-04    |       -3 |
| 2017-01-09     | 2017-01-11   | 2017-01-11    |      -10 |
| 2017-01-13     | 2017-01-13   | 2017-01-13    |       -2 |
+----------------+--------------+---------------+----------+

请注意,第二行仅是值:-10日期:2017-01-11和值:-7日期:2017-01-09,因为它不是最小值,并且对历史数据进行查询,所以整个日期都是这样范围是可用的,而不仅仅是当前日期行以及所有之前的日期行.

Please note that second row is only value: -10 date: 2017-01-11 and value: -7 date: 2017-01-09 is ignored since it is not the lowest value and the query works on historical data so whole date range is available to it and not just a current date row with all preceding ones.

查询不必作为单个查询.我可以为Q1创建专用表,例如使用它为Q2生成另一个表.或将来自Q1的数据列添加到测试表,然后为Q2生成表.但是尝试了很多次却失败了.

The queries do not need to be as a single query. It is possible to me to create dedicated tables for Q1 and for example use it to generate another table for Q2. Or add column of data from Q1 to test table and then generate table for Q2. But tried many times and failed.

查询(MySQL 8)以复制测试数据表并获取计算出的data_diff和data_max:

Query (MySQL 8) to Replicate test data table and get calculated data_diff and data_max:

CREATE TABLE IF NOT EXISTS `test`
(
   `date_time` DATETIME UNIQUE NOT NULL,
   `data` INT NOT NULL
)
ENGINE InnoDB;

INSERT INTO `test` VALUES
('2017-01-02', 2),
('2017-01-03', 4),
('2017-01-04', 1),
('2017-01-05', 3),
('2017-01-06', 1),
('2017-01-07', 4),
('2017-01-08', 5),
('2017-01-09', -2),
('2017-01-10', 0),
('2017-01-11', -5),
('2017-01-12', 6),
('2017-01-13', 4),
('2017-01-14', 6)
;

SELECT 
    DATE(`date_time`) AS `date`,
    `data`,
    `data_max`,
    IF(`data` < `data_max`,  - (`data_max` - `data`), NULL) 
    AS `data_diff`

FROM
    (
    SELECT 
        `date_time`,
        `data`,
        MAX(`data`) OVER (ORDER BY `date_time` ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS `data_max`
    FROM
        `test`
    ) t
;

也许您至少知道如何获取日期范围,并且可以通过回答

Perhaps you know at least how to get date ranges and can help me solving this problem by answering this question

推荐答案

我怀疑可以对其进行一些优化,但是这些查询应该可以为您提供所需的结果.它们共享相同的前3个CTE,它们为每个data_max生成diff_max值.在第一个查询中,我们只是寻找该值的变化(从NULL到一个值,或该值的减小),以便生成输出行.第二个查询的第4个和第5个CTE与第一个查询类似,但是我们在diff_max值中添加了RANK,因此我们可以将JOIN最小值(及其关联的日期)添加到date_diff_from和<第六CTE中的c24>值(与我对

I suspect they can be optimised somewhat but these queries should give you the results you want. They share the same first 3 CTEs which generate the diff_max value for each data_max. In the first query we just then look for a change in that value (from NULL to a value, or a decrease in the value) in order to generate the output rows. The second query's 4th and 5th CTEs are similar to the first query, but we add a RANK to the diff_max values, so we can JOIN the minimum value (with it's associated date) to the date_diff_from and date_diff_to values from the 6th CTE (which is the same as my answer to your other question).

问题1:

WITH cte AS (SELECT DATE(`date_time`) AS `date`,
                    `data`,
                    MAX(`data`) OVER (ORDER BY `date_time`) AS `data_max`
             FROM `test`),
cte2 AS (SELECT `date`,
                `data`,
                `data_max`,
                CASE WHEN `data` < `data_max` THEN `data` - `data_max` END AS `data_diff`
         FROM cte),
cte3 AS (SELECT `date`, 
                MIN(`data_diff`) OVER (PARTITION BY `data_max` ORDER BY `date`) AS `diff_max`
         FROM cte2),
cte4 AS (SELECT `date`, `diff_max`, LAG(`diff_max`) OVER (ORDER BY `date`) AS `old_diff_max`
         FROM cte3)
SELECT `date`, `diff_max`
FROM cte4
WHERE `diff_max` < `old_diff_max` OR `old_diff_max` IS NULL AND `diff_max` IS NOT NULL

输出:

date        diff_max
2017-01-04  -3
2017-01-09  -7
2017-01-11  -10
2017-01-13  -2

问题2:

WITH cte AS (SELECT DATE(`date_time`) AS `date`,
                    `data`,
                    MAX(`data`) OVER (ORDER BY `date_time`) AS `data_max`
             FROM `test`),
cte2 AS (SELECT `date`,
                `data`,
                `data_max`,
                CASE WHEN `data` < `data_max` THEN `data` - `data_max` END AS `data_diff`
         FROM cte),
cte3 AS (SELECT `data_max`, `date`, 
                MIN(`data_diff`) OVER (PARTITION BY `data_max` ORDER BY date) AS `diff_max`
         FROM cte2),
cte4 AS (SELECT `data_max`, `date`, `diff_max`, 
                LAG(`diff_max`) OVER (ORDER BY `date`) AS `old_diff_max`
         FROM cte3),
cte5 AS (SELECT `date`, `diff_max`, 
                RANK() OVER (PARTITION BY `data_max` ORDER BY `diff_max`) AS `diff_rank`
         FROM cte4
         WHERE `diff_max` < `old_diff_max` OR `old_diff_max` IS NULL AND `diff_max` IS NOT NULL),
cte6 AS (SELECT `data_max`, 
                MIN(CASE WHEN `data_diff` IS NOT NULL THEN date END) AS diff_date_from,
                MAX(CASE WHEN `data_diff` IS NOT NULL THEN date END) AS diff_date_to
         FROM cte2
         GROUP BY `data_max`
         HAVING diff_date_from IS NOT NULL)
SELECT diff_date_from, diff_date_to, `date` AS diff_max_date, `diff_max`
FROM cte6
JOIN cte5 ON cte5.date BETWEEN cte6.diff_date_from AND cte6.diff_date_to
WHERE cte5.diff_rank = 1

输出:

diff_date_from  diff_date_to    diff_max_date   diff_max
2017-01-04      2017-01-06      2017-01-04      -3
2017-01-09      2017-01-11      2017-01-11      -10
2017-01-13      2017-01-13      2017-01-13      -2

dbfiddle上的演示

这篇关于MySQL查询时间段和最大下降值与之前的最大值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆