mysql用以前的值生成缺失的日期 [英] mysql generate missing dates with previous value

查看:33
本文介绍了mysql用以前的值生成缺失的日期的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

下面是一个包含稀疏日期的 mysql 表.

Below is a mysql table which has sparse dates.

col dt_id  value
A1 2018-05-28 30
A1 2018-05-30 20
A1 2018-05-31 50
A1 2018-06-01 50
A1 2018-06-04 80
A1 2018-06-05 50

输出应该类似于下面的内容,其中缺少的日期与最后一个值一起填充.

The output should be something like below where missing dates are populated along with the last value.

col dt_id  value
A1 2018-05-28 30
A1 2018-05-29 30
A1 2018-05-30 20
A1 2018-05-31 50
A1 2018-06-01 50
A1 2018-06-02 50
A1 2018-06-03 50
A1 2018-06-04 80
A1 2018-06-05 50

这里生成了以下内容.

A1 2018-05-29 30
A1 2018-06-02 50
A1 2018-06-03 50

我知道使用 last_value() over (partition by..) 的 oracle 解决方案,但由于这是 mysql,所以它有点棘手.

I know solutions with oracle using last_value() over (partition by.., but since this is mysql, its a bit tricky.

这是我尝试过的:

创建时间表并填充数据:

create a time table and populate with data:

CREATE TABLE `time_table` (date_id date not null);
create table ints ( i tinyint ); insert into ints values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); 

insert into time_table (date_id) select date('2016-09-01')+ interval a.i*10000 + b.i*1000 + c.i*100 + d.i*10 + e.i day 
from ints a 
join ints b 
join ints c 
join ints d 
join ints e 
where (a.i*10000 + b.i*1000 + c.i*100 + d.i*10 + e.i) <= 11322 order by 1;

select * from time_table limit 10;
+------------+
| date_id    |
+------------+
| 2018-09-22 |
| 2018-09-21 |
| 2018-09-20 |
| 2018-09-19 |
| 2018-09-18 |
| 2018-09-17 |
| 2018-09-16 |
| 2018-09-15 |
| 2018-09-14 |
| 2018-09-13 |
+------------+

Here is the data for the balance table:
Here is the data
+------+------------+-------+
| A1   | 2018-05-28 |    30 |
| A1   | 2018-05-30 |    20 |
| A1   | 2018-05-31 |    50 |
| A1   | 2018-06-01 |    50 |
| A1   | 2018-06-04 |    80 |
| A1   | 2018-06-05 |    50 |
| B1   | 2018-05-28 |    30 |
| B1   | 2018-05-30 |    20 |
| B1   | 2018-05-31 |    50 |
| B1   | 2018-06-01 |    50 |
| B1   | 2018-06-04 |    80 |
| B1   | 2018-06-05 |    50 |
| C1   | 2018-05-28 |    30 |
| C1   | 2018-05-30 |    20 |
| C1   | 2018-05-31 |    50 |
| C1   | 2018-06-01 |    50 |
| C1   | 2018-06-04 |    80 |
| C1   | 2018-06-05 |    50 |
| D1   | 2018-06-28 |    30 |
| D1   | 2018-07-02 |    20 |
| D1   | 2018-07-04 |    50 |
| D1   | 2018-07-08 |    80 |
| D1   | 2018-07-19 |    50 |
+------+------------+-------+


mysql> select b.id, ab.id, tt.`date_id` as cal_date, b.`mx` as ex_date, val
    -> from time_table tt
    -> inner join (select id, min(date_id) mi, max(date_id) mx from balance group by id) b
    -> on tt.`date_id` >= b.`mi`
    -> and tt.`date_id` <= b.mx
    -> left join (select id, date_id, sum(value) val from balance group by id, date_id) ab
    -> on ab.id = b.id and tt.`date_id` = ab.date_id
    -> order by cal_date;
+------+------+------------+------------+------+
| id   | id   | cal_date   | ex_date    | val  |
+------+------+------------+------------+------+
| A1   | A1   | 2018-05-28 | 2018-06-05 |   30 |
| A1   | NULL | 2018-05-29 | 2018-06-05 | NULL |
| A1   | A1   | 2018-05-30 | 2018-06-05 |   20 |
| A1   | A1   | 2018-05-31 | 2018-06-05 |   50 |
| A1   | A1   | 2018-06-01 | 2018-06-05 |   50 |
| A1   | NULL | 2018-06-02 | 2018-06-05 | NULL |
| A1   | NULL | 2018-06-03 | 2018-06-05 | NULL |
| A1   | A1   | 2018-06-04 | 2018-06-05 |   80 |
| A1   | A1   | 2018-06-05 | 2018-06-05 |   50 |
+------+------+------------+------------+------+

推荐答案

对于 MySQL 8:

For MySQL 8:

with recursive rcte(dt_id, col, value) as (
  (
    select dt_id, col, value
    from mytable
    order by dt_id
    limit 1
  )
  union all
  select r.dt_id + interval 1 day
       , coalesce(t.col, r.col)     
       , coalesce(t.value, r.value)
  from rcte r
  left join mytable t on t.dt_id = r.dt_id + interval 1 day
  where r.dt_id < (select max(dt_id) from mytable)
)
select r.col, r.dt_id, r.value
from rcte r
order by r.dt_id

db-fiddle

递归查询将逐行构建日期,从第一个日期开始到最后一个日期.value(和 col)取自原始表,该表在日期上保持连接.如果原始表没有日期行,则取递归中最后一行的值.

The recursive query will build row by row incrementing the date starting from the first date until the last. The value (and col) is taken from the original table, which is left joined on date. If the original table doesn't have a row for a date, the value of the last row in the recursion is taken instead.

对于旧版本,您可以使用日历表和左连接 ON 子句中的子查询来获取最后存在的值:

For older versions you can use your calendar table and a subquery in the left joins ON clause to get last existing values:

select b.col, c.date_id, b.value
from time_table c
left join balance b on b.dt_id = (
  select max(dt_id)
  from balance b1
  where b1.dt_id <= c.date_id
)
where c.date_id >= (select min(dt_id) from balance)
  and c.date_id <= (select max(dt_id) from balance)

db-fiddle

由于问题已更改:

select b.col, c.date_id, b.value
from (
  select col, min(dt_id) as min_dt, max(dt_id) as max_dt
  from balance
  group by col
) i
join time_table c
  on  c.date_id >= i.min_dt
  and c.date_id <= i.max_dt
left join balance b
  on  b.col = i.col
  and b.dt_id = (
    select max(dt_id)
    from balance b1
    where b1.dt_id <= c.date_id
      and b1.col = i.col
)
order by b.col, c.date_id

db-fiddle

确保您在 (col, dt_id) 上有一个索引.在最好的情况下,它将是主键.time_table 中的 date_id 也应该被索引或者是主键.

Make sure you have an index on (col, dt_id). In best case it would be the primary key. date_id in the time_table should also be indexed or the primary key.

这篇关于mysql用以前的值生成缺失的日期的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆