基于每日传入数据的实体的SQL构造和更新日期间隔 [英] SQL construct and renew date intervals for entities based on daily incoming data

查看:78
本文介绍了基于每日传入数据的实体的SQL构造和更新日期间隔的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我每天都有以下格式的传入数据: (id, dat, state, val)其中,id是唯一实体,dat是日期(即ISO'yyyy-MM-dd'),state是实体的状态,val是该状态下的实体的值.

我需要将其转换为以下格式的数据:(id, state, val, dat1, dat2, month)其中dat1dat2是日期戳,在该日期戳之间,实体id的状态为state,值在一个月内为val.

我正在努力寻找一种正确的方法来收集和保存这种格式的数据.我了解解决方案必须满足以下要求:

1)如果id在新的每日数据(即2017年1月27日)中将其statest1更改为st2,则应该有一个封闭期的行,如下所示:

(id=1, state=st1, val=2.5, dat1='2017-01-15', dat2='2017-01-26', month='2017-01-01')

和状态st2的一行,像这样:

(id=1, state=st2, val=5.5, dat1='2017-01-27', dat2='2017-01-31', month='2017-01-01')

2)如果在2017年2月14日有一个新的表id,则应该插入这样的行:

(id=2, state=st3, val=1.5, dat1='2017-02-01', dat2='2017-02-28', month='2017-02-01'), 因此,如果每日数据中有一个新的ID,而在以前的期间中不存在此ID,我们会将他的出生日期作为该期间的开始,而将结束日期保留为月底,直到该ID的状态在内部发生变化2月.

3)设计表的方式应允许在插入\更新新的每日数据(双向表?)之前具有表版本的历史记录,以便有机会丢弃错误的数据并回滚./p>

4)应该有一个查询表,该表能够检测相同ID的重叠日期间隔(一段时间内强制ID完整性).

5)对于此类数据,应该有一些必要的索引. ID应该是外键.

6)是否应该有一个触发器,当新的数据到达负载表并执行1),2),3)中所述的操作时,该触发器起作用吗?

有人可以告诉文献或这类任务的已知方法吗? 我相信这是可以解决的任务,但我无法确定数据模型的正确名称,也无法找到此类数据的任何清晰示例.

解决方案

以下代码基于每日数据"更新实体状态表. 我创建了自己的样本数据,据我所知,它满足您的要求.

CREATE TABLE #DailyData( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME )
CREATE TABLE #EntityState( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME, dat2 DATETIME, [month] DATETIME )
INSERT INTO #EntityState
SELECT 1, 'st2', 4.6, '01-Jul-2017', '08-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 1, 'st1', 1.2, '09-Jul-2017', '31-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 2, 'st1', 2.2, '08-Jul-2017', '31-Jul-2017', '01-Jul-2017'

INSERT INTO #DailyData
SELECT 1, 'st3', 5.6, '15-Jul-2017' UNION ALL
SELECT 1, 'st2', 4.6, '01-Aug-2017' UNION ALL
SELECT 2, 'st2', 2.5, '10-Jul-2017'


SELECT 'Before', * FROM #EntityState
;WITH Q0 AS(
    -- Insert current state
    SELECT id, [state], val, dat1
    FROM
    ( SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 DESC ) AS LatestDate
    FROM #EntityState ) AS CurrentState
    WHERE LatestDate = 1
    UNION ALL
    -- Combine with Daily Import
    SELECT id, [state], val, dat1
    FROM #DailyData
)
,
-- Add Date order column
Q1 AS (
    SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 ASC ) AS DateOrder
    FROM Q0
    )
-- Merge new Entity states with Existing
MERGE INTO #EntityState AS T
    USING
            -- New Entity states 
            ( SELECT Prev.id, Prev.[state], Prev.val, Prev.dat1, Next.dat1 AS dat2
            FROM Q1 AS Prev
                LEFT JOIN Q1 AS Next ON Prev.id = Next.id AND Prev.DateOrder + 1 = Next.DateOrder ) AS S
        ON T.id = S.id AND T.dat1 = S.dat1 AND T.[state] = S.[state]
    -- Update dat2 on last day's state
    WHEN MATCHED AND T.dat2 = EOMONTH( T.dat1 )
        THEN UPDATE 
            SET T.dat2 = S.dat2
    -- Add new Entity State
    WHEN NOT MATCHED BY TARGET THEN
        INSERT( id, state, val, dat1, dat2, [month] )
            VALUES( S.id, S.state, S.val, S.dat1, ISNULL( S.dat2, EOMONTH( S.dat1 )),
                    DATEADD( month, DATEDIFF( month, 0, S.dat1 ), 0 ));

SELECT 'After', * FROM #EntityState

此代码改编自: 在切换间隔SQL Server上计算秒数

I have a daily incoming data with following format: (id, dat, state, val) where id is unique entity, dat is date (i.e. ISO 'yyyy-MM-dd'), state is a state of an entity, val is a value of an entity at this state.

I need to transform it into data like that format: (id, state, val, dat1, dat2, month) where dat1 and dat2 are datestamps between which the entity id has state state with val value within month.

I'm struggling to find out a correct way to collect and persist data like this format. I understand that solution must meet the following requirements:

1) If id has changed his state from st1 to st2 in new daily data(i.e. on 2017-01-27), there should be a row with closed period like that:

(id=1, state=st1, val=2.5, dat1='2017-01-15', dat2='2017-01-26', month='2017-01-01')

and a row for state st2 like that:

(id=1, state=st2, val=5.5, dat1='2017-01-27', dat2='2017-01-31', month='2017-01-01')

2) If on 2017-02-14 there is a new id for a table, there should be an inserted row like that:

(id=2, state=st3, val=1.5, dat1='2017-02-01', dat2='2017-02-28', month='2017-02-01'), so if there is a new id in daily data and this id doesn't exist in previous periods we put his birth date as beginning of the period and hold the closing date as end of month until there is a change of state of this id within February month.

3) A table should be designed in the way allowing having history of table versions before insert\update a new daily piece of data (bi-temporal table?) in order to have an opportunity to discard wrong data and rollback.

4) There should be a query for table which is able to detect overlapping date intervals of the same ID (force ID integrity over time).

5) There should be some necessary indexes for this kind of data. Id should be a foreign key.

6) Should there be a trigger which works when a new piece of data has arrived into load-table and makes operations described in 1),2),3) ?

Could anyone tell the literature or known approaches for this kind of tasks? I believe it is solved task but I couldn't determine the proper name of data model and find any clear examples for that kind of data.

解决方案

Below code updates Entity state table based on "Daily Data". I have created my own sample data, that to the best of my understanding fits your requirements.

CREATE TABLE #DailyData( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME )
CREATE TABLE #EntityState( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME, dat2 DATETIME, [month] DATETIME )
INSERT INTO #EntityState
SELECT 1, 'st2', 4.6, '01-Jul-2017', '08-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 1, 'st1', 1.2, '09-Jul-2017', '31-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 2, 'st1', 2.2, '08-Jul-2017', '31-Jul-2017', '01-Jul-2017'

INSERT INTO #DailyData
SELECT 1, 'st3', 5.6, '15-Jul-2017' UNION ALL
SELECT 1, 'st2', 4.6, '01-Aug-2017' UNION ALL
SELECT 2, 'st2', 2.5, '10-Jul-2017'


SELECT 'Before', * FROM #EntityState
;WITH Q0 AS(
    -- Insert current state
    SELECT id, [state], val, dat1
    FROM
    ( SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 DESC ) AS LatestDate
    FROM #EntityState ) AS CurrentState
    WHERE LatestDate = 1
    UNION ALL
    -- Combine with Daily Import
    SELECT id, [state], val, dat1
    FROM #DailyData
)
,
-- Add Date order column
Q1 AS (
    SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 ASC ) AS DateOrder
    FROM Q0
    )
-- Merge new Entity states with Existing
MERGE INTO #EntityState AS T
    USING
            -- New Entity states 
            ( SELECT Prev.id, Prev.[state], Prev.val, Prev.dat1, Next.dat1 AS dat2
            FROM Q1 AS Prev
                LEFT JOIN Q1 AS Next ON Prev.id = Next.id AND Prev.DateOrder + 1 = Next.DateOrder ) AS S
        ON T.id = S.id AND T.dat1 = S.dat1 AND T.[state] = S.[state]
    -- Update dat2 on last day's state
    WHEN MATCHED AND T.dat2 = EOMONTH( T.dat1 )
        THEN UPDATE 
            SET T.dat2 = S.dat2
    -- Add new Entity State
    WHEN NOT MATCHED BY TARGET THEN
        INSERT( id, state, val, dat1, dat2, [month] )
            VALUES( S.id, S.state, S.val, S.dat1, ISNULL( S.dat2, EOMONTH( S.dat1 )),
                    DATEADD( month, DATEDIFF( month, 0, S.dat1 ), 0 ));

SELECT 'After', * FROM #EntityState

This code was adapted from: Count seconds on switch interval SQL Server

这篇关于基于每日传入数据的实体的SQL构造和更新日期间隔的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆