Find the missing entries for the working days and fill the row with the values from the closest date


Question

The problem splits into two parts: how to check which working days are missing from my database and, if some are missing, how to add them and fill each new row with the values from the closest date.

First part: check and find the days. Should I use a gap approach like in the example below?

SELECT t1.col1 AS startOfGap, MIN(t2.col1) AS endOfGap  
   FROM  
   (SELECT col1 = theDate + 1  FROM sampleDates tbl1  
      WHERE NOT EXISTS(SELECT * FROM sampleDates tbl2  
                      WHERE tbl2.theDate = tbl1.theDate + 1) 
      AND theDate <> (SELECT MAX(theDate) FROM sampleDates)) t1 
   INNER JOIN  
   (SELECT col1 = theDate - 1  FROM sampleDates tbl1  
      WHERE NOT EXISTS(SELECT * FROM sampleDates tbl2  
                      WHERE tbl1.theDate = tbl2.theDate + 1) 
      AND theDate <> (SELECT MIN(theDate) FROM sampleDates)) t2  
   ON t1.col1 <= t2.col1 
   GROUP BY t1.col1; 

Then I need to see which date is closest to the one that was missing and fill the newly inserted date (the one that was missing) with the values from that closest date. Some time ago I came up with something to get the closest value from a row, but this time I need to adapt it to check both downwards and upwards.

SELECT
t,A, C,Y,
COALESCE(Y, 
            (SELECT TOP (1) Y  
            FROM tableT  AS p2 
            WHERE
                  p2.Y IS NOT NULL 
                  AND p2.[t] <= p.[t] and p.C = p2.C
 ORDER BY p2.[t] DESC)) as 'YNew'
FROM tableT AS p
order by c, t

How can I combine the two into one?

Thanks.

Expected result

  Date          1mA 
20.12.2012    0.152
21.12.2012    0.181 
22 weekend so it's skipped (they are skipped automatically)  
23 weekend -,- 
24 missing  
25 missing 
26 missing
27.12.2012    0.173
28.12.2012    0.342


  Date          1mA 
20.12.2012    0.152
21.12.2012    0.181 
22 weekend so it's skipped (they are skipped automatically)  
23 weekend    0.181
24 missing    0.181
25 missing    0.181
26 missing    0.173
27.12.2012    0.173
28.12.2012    0.342

So the 24th, 25th and 26th are not even there with null values. They are simply not there.

EDIT 2: For taking the closest value, let's consider the scenario in which I'm always looking above, so always going back one entry when a date is missing.

  Date          1mA 
20.12.2012    0.152
21.12.2012    0.181 
22 weekend so it's skipped (they are skipped automatically)  
23 weekend    0.181
24 missing    0.181
25 missing    0.181
26 missing    0.181 
27.12.2012    0.173
28.12.2012    0.342

Recommended answer

For these types of queries you gain significant performance benefits from creating a calendar table containing every date you'll ever need to test. (If you're familiar with the term "dimension tables", this is just one such table, used to enumerate every date of interest.)
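The answer does not show how the calendar table is built. One possible way is sketched here under stated assumptions: the table has a single DATE column named calendar_date (the name used in the answer's queries), the range 2012-01-01 to 2015-12-31 is an arbitrary illustration, and only Saturdays and Sundays count as non-working days (public holidays are not handled).

```sql
-- Assumed schema: the answer only names the table and its calendar_date column.
CREATE TABLE calendar (
    calendar_date DATE NOT NULL PRIMARY KEY
);

-- Generate every day in the (illustrative) range, then keep Monday to Friday.
WITH all_days AS (
    SELECT CAST('2012-01-01' AS DATE) AS d
    UNION ALL
    SELECT DATEADD(DAY, 1, d)
    FROM all_days
    WHERE d < '2015-12-31'
)
INSERT INTO calendar (calendar_date)
SELECT d
FROM all_days
-- 1900-01-01 was a Monday, so the day offset modulo 7 is 0..4 for Mon..Fri.
-- This avoids depending on the session's DATEFIRST or language settings.
WHERE DATEDIFF(DAY, '19000101', d) % 7 < 5
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

Because the table is filled once and then only read, the recursive CTE's cost is paid a single time. If public holidays should also be skipped, they would have to be deleted from (or never inserted into) this table separately.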

Also, the query as a whole can become significantly simpler.

SELECT
   cal.calendar_date   AS data_date,
   CASE WHEN prev_data.gap <= next_data.gap
        THEN prev_data.data_value
        ELSE COALESCE(next_data.data_value, prev_data.data_value)
   END
       AS data_value
FROM
    calendar   AS cal
OUTER APPLY
(
    SELECT TOP(1)
        data_date,
        data_value,
        DATEDIFF(DAY, data_date, cal.calendar_date)   AS gap
    FROM
        data_table
    WHERE
        data_date <= cal.calendar_date
    ORDER BY
        data_date DESC
)
   prev_data
OUTER APPLY
(
    SELECT TOP(1)
        data_date,
        data_value,
        DATEDIFF(DAY, cal.calendar_date, data_date)   AS gap
    FROM
        data_table
    WHERE
        data_date >  cal.calendar_date
    ORDER BY
        data_date ASC
)
   next_data
WHERE
   cal.calendar_date BETWEEN '2015-01-01' AND '2015-12-31'
;
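For the first part of the question (only finding which working days are missing), the calendar table can also be used on its own. A minimal sketch, assuming calendar contains working days only and reusing the table and column names from the query above:

```sql
-- List every calendar (working) day in the range that has no row in data_table.
SELECT
    cal.calendar_date AS missing_date
FROM
    calendar AS cal
WHERE
    cal.calendar_date BETWEEN '2015-01-01' AND '2015-12-31'
    AND NOT EXISTS
    (
        SELECT 1
        FROM data_table AS d
        WHERE d.data_date = cal.calendar_date
    )
ORDER BY
    cal.calendar_date;
```

This is an anti-join against the calendar, so it reports each missing day individually rather than the start and end of each gap.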

EDIT: Reply to your comment, which has a different requirement

To always get "the value above" is easier, and inserting those values into a table is easy enough...

INSERT INTO
    data_table
SELECT
   cal.calendar_date,
   prev_data.data_value
FROM
    calendar   AS cal
CROSS APPLY
(
    SELECT TOP(1)
        data_date,
        data_value
    FROM
        data_table
    WHERE
        data_date <= cal.calendar_date
    ORDER BY
        data_date DESC
)
   prev_data
WHERE
       cal.calendar_date BETWEEN '2015-01-01' AND '2015-12-31'
   AND cal.calendar_date <> prev_data.data_date
;
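To see what this does with the sample data from the question, the statement can be tried on scratch tables. This is only a sketch: the temporary tables #data_table and #calendar, their column types, and the hand-typed list of working days are assumptions made for the demonstration.

```sql
-- Scratch copy of the question's data (24th to 26th are absent).
CREATE TABLE #data_table (data_date DATE PRIMARY KEY, data_value DECIMAL(9,3));
INSERT INTO #data_table (data_date, data_value) VALUES
    ('2012-12-20', 0.152), ('2012-12-21', 0.181),
    ('2012-12-27', 0.173), ('2012-12-28', 0.342);

-- Working days only: 22 and 23 December 2012 were a weekend.
CREATE TABLE #calendar (calendar_date DATE PRIMARY KEY);
INSERT INTO #calendar (calendar_date) VALUES
    ('2012-12-20'), ('2012-12-21'), ('2012-12-24'), ('2012-12-25'),
    ('2012-12-26'), ('2012-12-27'), ('2012-12-28');

-- Same logic as the INSERT above: copy the most recent earlier value
-- into every working day that has no row yet.
INSERT INTO #data_table (data_date, data_value)
SELECT cal.calendar_date, prev_data.data_value
FROM #calendar AS cal
CROSS APPLY
(
    SELECT TOP (1) data_date, data_value
    FROM #data_table
    WHERE data_date <= cal.calendar_date
    ORDER BY data_date DESC
) AS prev_data
WHERE cal.calendar_date <> prev_data.data_date;

SELECT data_date, data_value FROM #data_table ORDER BY data_date;
-- 2012-12-20  0.152
-- 2012-12-21  0.181
-- 2012-12-24  0.181
-- 2012-12-25  0.181
-- 2012-12-26  0.181
-- 2012-12-27  0.173
-- 2012-12-28  0.342
```

The three inserted rows all carry 0.181, the value from 21 December, which matches the "always looking above" scenario in EDIT 2 of the question. Weekend dates never appear because they are not in the calendar.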

Note: You could add prev_data.gap > 0 as an extra AND condition in the WHERE clause of the bigger query above to only get dates that don't already have data.
