基于每日传入数据的实体的SQL构造和更新日期间隔 [英] SQL construct and renew date intervals for entities based on daily incoming data
问题描述
我每天都有以下格式的传入数据:
(id, dat, state, val)
其中,id
是唯一实体,dat
是日期(即ISO'yyyy-MM-dd'),state
是实体的状态,val
是该状态下的实体的值.
我需要将其转换为以下格式的数据:(id, state, val, dat1, dat2, month)
其中dat1
和dat2
是日期戳,在该日期戳之间,实体id
的状态为state
,值在一个月内为val
.
我正在努力寻找一种正确的方法来收集和保存这种格式的数据.我了解解决方案必须满足以下要求:
1)如果id
在新的每日数据(即2017年1月27日)中将其state
从st1
更改为st2
,则应该有一个封闭期的行,如下所示:>
(id=1, state=st1, val=2.5, dat1='2017-01-15', dat2='2017-01-26', month='2017-01-01')
和状态st2
的一行,像这样:
(id=1, state=st2, val=5.5, dat1='2017-01-27', dat2='2017-01-31', month='2017-01-01')
2)如果在2017年2月14日有一个新的表id
,则应该插入这样的行:
(id=2, state=st3, val=1.5, dat1='2017-02-01', dat2='2017-02-28', month='2017-02-01')
,
因此,如果每日数据中有一个新的ID,而在以前的期间中不存在此ID,我们会将他的出生日期作为该期间的开始,而将结束日期保留为月底,直到该ID的状态在内部发生变化2月.
3)设计表的方式应允许在插入\更新新的每日数据(双向表?)之前具有表版本的历史记录,以便有机会丢弃错误的数据并回滚./p>
4)应该有一个查询表,该表能够检测相同ID的重叠日期间隔(一段时间内强制ID完整性).
5)对于此类数据,应该有一些必要的索引. ID应该是外键.
6)是否应该有一个触发器,当新的数据到达负载表并执行1),2),3)中所述的操作时,该触发器起作用吗?
有人可以告诉文献或这类任务的已知方法吗? 我相信这是可以解决的任务,但我无法确定数据模型的正确名称,也无法找到此类数据的任何清晰示例.
以下代码基于每日数据"更新实体状态表. 我创建了自己的样本数据,据我所知,它满足您的要求.
CREATE TABLE #DailyData( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME )
CREATE TABLE #EntityState( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME, dat2 DATETIME, [month] DATETIME )
INSERT INTO #EntityState
SELECT 1, 'st2', 4.6, '01-Jul-2017', '08-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 1, 'st1', 1.2, '09-Jul-2017', '31-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 2, 'st1', 2.2, '08-Jul-2017', '31-Jul-2017', '01-Jul-2017'
INSERT INTO #DailyData
SELECT 1, 'st3', 5.6, '15-Jul-2017' UNION ALL
SELECT 1, 'st2', 4.6, '01-Aug-2017' UNION ALL
SELECT 2, 'st2', 2.5, '10-Jul-2017'
SELECT 'Before', * FROM #EntityState
;WITH Q0 AS(
-- Insert current state
SELECT id, [state], val, dat1
FROM
( SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 DESC ) AS LatestDate
FROM #EntityState ) AS CurrentState
WHERE LatestDate = 1
UNION ALL
-- Combine with Daily Import
SELECT id, [state], val, dat1
FROM #DailyData
)
,
-- Add Date order column
Q1 AS (
SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 ASC ) AS DateOrder
FROM Q0
)
-- Merge new Entity states with Existing
MERGE INTO #EntityState AS T
USING
-- New Entity states
( SELECT Prev.id, Prev.[state], Prev.val, Prev.dat1, Next.dat1 AS dat2
FROM Q1 AS Prev
LEFT JOIN Q1 AS Next ON Prev.id = Next.id AND Prev.DateOrder + 1 = Next.DateOrder ) AS S
ON T.id = S.id AND T.dat1 = S.dat1 AND T.[state] = S.[state]
-- Update dat2 on last day's state
WHEN MATCHED AND T.dat2 = EOMONTH( T.dat1 )
THEN UPDATE
SET T.dat2 = S.dat2
-- Add new Entity State
WHEN NOT MATCHED BY TARGET THEN
INSERT( id, state, val, dat1, dat2, [month] )
VALUES( S.id, S.state, S.val, S.dat1, ISNULL( S.dat2, EOMONTH( S.dat1 )),
DATEADD( month, DATEDIFF( month, 0, S.dat1 ), 0 ));
SELECT 'After', * FROM #EntityState
此代码改编自: 在切换间隔SQL Server上计算秒数
I have a daily incoming data with following format:
(id, dat, state, val)
where id
is unique entity, dat
is date (i.e. ISO 'yyyy-MM-dd'), state
is a state of an entity, val
is a value of an entity at this state.
I need to transform it into data like that format: (id, state, val, dat1, dat2, month)
where dat1
and dat2
are datestamps between which the entity id
has state state
with val
value within month.
I'm struggling to find out a correct way to collect and persist data like this format. I understand that solution must meet the following requirements:
1) If id
has changed his state
from st1
to st2
in new daily data(i.e. on 2017-01-27), there should be a row with closed period like that:
(id=1, state=st1, val=2.5, dat1='2017-01-15', dat2='2017-01-26', month='2017-01-01')
and a row for state st2
like that:
(id=1, state=st2, val=5.5, dat1='2017-01-27', dat2='2017-01-31', month='2017-01-01')
2) If on 2017-02-14 there is a new id
for a table, there should be an inserted row like that:
(id=2, state=st3, val=1.5, dat1='2017-02-01', dat2='2017-02-28', month='2017-02-01')
,
so if there is a new id in daily data and this id doesn't exist in previous periods we put his birth date as beginning of the period and hold the closing date as end of month until there is a change of state of this id within February month.
3) A table should be designed in the way allowing having history of table versions before insert\update a new daily piece of data (bi-temporal table?) in order to have an opportunity to discard wrong data and rollback.
4) There should be a query for table which is able to detect overlapping date intervals of the same ID (force ID integrity over time).
5) There should be some necessary indexes for this kind of data. Id should be a foreign key.
6) Should there be a trigger which works when a new piece of data has arrived into load-table and makes operations described in 1),2),3) ?
Could anyone tell the literature or known approaches for this kind of tasks? I believe it is solved task but I couldn't determine the proper name of data model and find any clear examples for that kind of data.
Below code updates Entity state table based on "Daily Data". I have created my own sample data, that to the best of my understanding fits your requirements.
CREATE TABLE #DailyData( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME )
CREATE TABLE #EntityState( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME, dat2 DATETIME, [month] DATETIME )
INSERT INTO #EntityState
SELECT 1, 'st2', 4.6, '01-Jul-2017', '08-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 1, 'st1', 1.2, '09-Jul-2017', '31-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 2, 'st1', 2.2, '08-Jul-2017', '31-Jul-2017', '01-Jul-2017'
INSERT INTO #DailyData
SELECT 1, 'st3', 5.6, '15-Jul-2017' UNION ALL
SELECT 1, 'st2', 4.6, '01-Aug-2017' UNION ALL
SELECT 2, 'st2', 2.5, '10-Jul-2017'
SELECT 'Before', * FROM #EntityState
;WITH Q0 AS(
-- Insert current state
SELECT id, [state], val, dat1
FROM
( SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 DESC ) AS LatestDate
FROM #EntityState ) AS CurrentState
WHERE LatestDate = 1
UNION ALL
-- Combine with Daily Import
SELECT id, [state], val, dat1
FROM #DailyData
)
,
-- Add Date order column
Q1 AS (
SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 ASC ) AS DateOrder
FROM Q0
)
-- Merge new Entity states with Existing
MERGE INTO #EntityState AS T
USING
-- New Entity states
( SELECT Prev.id, Prev.[state], Prev.val, Prev.dat1, Next.dat1 AS dat2
FROM Q1 AS Prev
LEFT JOIN Q1 AS Next ON Prev.id = Next.id AND Prev.DateOrder + 1 = Next.DateOrder ) AS S
ON T.id = S.id AND T.dat1 = S.dat1 AND T.[state] = S.[state]
-- Update dat2 on last day's state
WHEN MATCHED AND T.dat2 = EOMONTH( T.dat1 )
THEN UPDATE
SET T.dat2 = S.dat2
-- Add new Entity State
WHEN NOT MATCHED BY TARGET THEN
INSERT( id, state, val, dat1, dat2, [month] )
VALUES( S.id, S.state, S.val, S.dat1, ISNULL( S.dat2, EOMONTH( S.dat1 )),
DATEADD( month, DATEDIFF( month, 0, S.dat1 ), 0 ));
SELECT 'After', * FROM #EntityState
This code was adapted from: Count seconds on switch interval SQL Server
这篇关于基于每日传入数据的实体的SQL构造和更新日期间隔的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!