How can I write a query to extract individual changes from snapshots of data?

Question

I need to create a process that will extract the changes from a table where each row is a snapshot of a row in another table. The real-world problem involves many tables with many fields, but as a simple example, suppose that I have the following snapshot data:

Sequence    DateTaken      ID       Field1    Field2
--------    -----------    ----     ------    ------
       1    '2011-01-01'      1     'Red'          2
       2    '2011-01-01'      2     'Blue'        10
       3    '2011-02-01'      1     'Green'        2
       4    '2011-03-01'      1     'Green'        3
       5    '2011-03-01'      2     'Purple'       2
       6    '2011-04-01'      1     'Yellow'       2

The Sequence and DateTaken fields relate directly to the snapshot table itself. The ID field is the primary key of the source table and Field1 and Field2 are other fields in the same (source) table.

I can get part-way to a solution with a query like this:

WITH Snapshots (Sequence, DateTaken, ID, Field1, Field2, _Index)
AS
(
    SELECT Sequence, DateTaken, ID, Field1, Field2, ROW_NUMBER() OVER (ORDER BY ID, Sequence) _Index
    FROM #Snapshots
)
SELECT
      c.DateTaken, c.ID
    , c.Field1 Field1_Current, p.Field1 Field1_Previous, CASE WHEN c.Field1 = p.Field1 THEN 0 ELSE 1 END Field1_Changed
    , c.Field2 Field2_Current, p.Field2 Field2_Previous, CASE WHEN c.Field2 = p.Field2 THEN 0 ELSE 1 END Field2_Changed
FROM Snapshots c
JOIN Snapshots p ON p.ID = c.ID AND (p._Index + 1) = c._Index
ORDER BY c.Sequence DESC

The above query will identify what is changing from one snapshot to the next, but it is still not in the form that I need. Each row in the output may contain several changes. At the end of the day, I need one row per change that identifies what field was changed, along with its previous/current values. Fields that have not actually changed will need to be excluded from the final output. So if the above query output is like this:

DateTaken   ID  Field1_Current  Field1_Previous  Field1_Changed  Field2_Current  Field2_Previous  Field2_Changed
----------  --  --------------  ---------------  --------------  --------------  ---------------  --------------
2011-04-01  1   Yellow          Green            1               2               3                1
2011-02-01  1   Green           Red              1               2               2                0

I need to transform that into something like this:

DateTaken   ID  Field    Previous   Current
----------  --  -------  --------   ---------
2011-04-01  1   Field1   Green      Yellow
2011-04-01  1   Field2   3          2
2011-02-01  1   Field1   Red        Green

I thought I might be able to get there with UNPIVOT, but I've not been able to make that work. I consider any solution involving cursors or similar to be an absolute last resort.

Your suggestions are greatly appreciated.

Recommended answer

Here's a working sample that uses UNPIVOT. It's based on my answer to my own question, "Better way to Partially UNPIVOT in Pairs in SQL".

This has some nice features:

  1. Adding additional fields is easy. Just add values to the SELECT and UNPIVOT clauses. You don't have to add additional UNION clauses.

  2. The WHERE clause, WHERE curr.value <> prev.value, never changes regardless of how many fields are added.

  3. Performance is surprisingly fast.

  4. It's portable to current versions of Oracle if you need that.
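To see why the WHERE clause can stay fixed, it may help to look at what UNPIVOT produces for a single current/previous pair. The following is a minimal sketch, not part of the original answer; the table variable @pair and its single row are hypothetical and only mirror the column naming used in the query under "SQL".

```sql
-- Hypothetical one-row input: each field has a current and a "_Previous" column.
-- All unpivoted columns share one type, which UNPIVOT requires.
DECLARE @pair TABLE (
    ID              int,
    Field1          varchar(max),
    Field1_Previous varchar(max),
    Field2          varchar(max),
    Field2_Previous varchar(max));

INSERT INTO @pair VALUES (1, 'Yellow', 'Green', '2', '3');

-- UNPIVOT emits one (field, value) row per column named in the IN list,
-- so every field, whatever its name, ends up in the same two columns.
SELECT ID, field, value
FROM @pair p
UNPIVOT (value FOR field IN (Field1, Field1_Previous,
                             Field2, Field2_Previous)) AS unpvt;
```

Because each column becomes a row with a generic value column, comparing curr.value with prev.value covers every field at once. The only per-field work is listing the column names.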

SQL

Declare @Snapshots as table(
Sequence int,
DateTaken      datetime,
[id] int,
field1 varchar(20),
field2 int)



INSERT INTO @Snapshots VALUES 

      (1,    '2011-01-01',      1,     'Red',          2),
      (2,    '2011-01-01',      2,     'Blue',        10),
      (3,    '2011-02-01',      1,     'Green',        2),
      (4,    '2011-03-01',      1,     'Green' ,       3),
      (5,    '2011-03-01',      2,     'Purple',       2),
      (6,    '2011-04-01',      1,     'Yellow',       2)

-- Number the snapshots so that consecutive snapshots of the same ID are adjacent.
;WITH Snapshots (Sequence, DateTaken, ID, Field1, Field2, _Index)
AS
(
    SELECT Sequence, DateTaken, ID, Field1, Field2,
           ROW_NUMBER() OVER (ORDER BY ID, Sequence) _Index
    FROM @Snapshots
)
-- Pair each snapshot (c) with the previous snapshot of the same ID (p).
-- UNPIVOT requires all unpivoted columns to share one type, hence the casts.
, data AS
(
    SELECT
          c._Index
        , c.DateTaken
        , c.ID
        , CAST(c.Field1 AS varchar(max)) Field1
        , CAST(p.Field1 AS varchar(max)) Field1_Previous
        , CAST(c.Field2 AS varchar(max)) Field2
        , CAST(p.Field2 AS varchar(max)) Field2_Previous
    FROM Snapshots c
    JOIN Snapshots p ON p.ID = c.ID AND (p._Index + 1) = c._Index
)
-- Turn the current/previous columns into one row per (snapshot, column).
, fieldsToRows AS
(
    SELECT DateTaken, ID, _Index, value, field
    FROM data p
    UNPIVOT (value FOR field IN (Field1, Field1_Previous,
                                 Field2, Field2_Previous)) AS unpvt
)
-- Match each current value with its "_Previous" counterpart; keep only real changes.
SELECT
      curr.DateTaken
    , curr.ID
    , curr.field
    , prev.value previous
    , curr.value [current]
FROM fieldsToRows curr
INNER JOIN fieldsToRows prev
    ON  curr.ID = prev.ID
    AND curr._Index = prev._Index
    AND curr.field + '_Previous' = prev.field
WHERE curr.value <> prev.value

Output

DateTaken               ID          field     previous current
----------------------- ----------- --------- -------- -------
2011-02-01 00:00:00.000 1           Field1    Red      Green
2011-03-01 00:00:00.000 1           Field2    2        3
2011-04-01 00:00:00.000 1           Field1    Green    Yellow
2011-04-01 00:00:00.000 1           Field2    3        2
2011-03-01 00:00:00.000 2           Field1    Blue     Purple
2011-03-01 00:00:00.000 2           Field2    10       2
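
As an illustration of the first two features listed above, here is a sketch of the same query extended with one more column. Field3 and its sample values are hypothetical and do not appear in the question. The sample is cut down to three snapshots of ID 1 to keep it short. The lines marked "new" are the only additions, and the final WHERE clause is unchanged. The ORDER BY is also an addition, since without one the row order is not guaranteed.

```sql
-- Field3 is a hypothetical extra column, added only to illustrate the pattern.
DECLARE @Snapshots TABLE (
    Sequence  int,
    DateTaken datetime,
    ID        int,
    Field1    varchar(20),
    Field2    int,
    Field3    date);

INSERT INTO @Snapshots VALUES
    (1, '2011-01-01', 1, 'Red',   2, '2010-12-31'),
    (3, '2011-02-01', 1, 'Green', 2, '2010-12-31'),
    (4, '2011-03-01', 1, 'Green', 3, '2011-02-15');

;WITH Snapshots AS
(
    SELECT *, ROW_NUMBER() OVER (ORDER BY ID, Sequence) AS _Index
    FROM @Snapshots
)
, data AS
(
    SELECT c._Index, c.DateTaken, c.ID
        , CAST(c.Field1 AS varchar(max)) AS Field1
        , CAST(p.Field1 AS varchar(max)) AS Field1_Previous
        , CAST(c.Field2 AS varchar(max)) AS Field2
        , CAST(p.Field2 AS varchar(max)) AS Field2_Previous
        , CAST(c.Field3 AS varchar(max)) AS Field3            -- new
        , CAST(p.Field3 AS varchar(max)) AS Field3_Previous   -- new
    FROM Snapshots c
    JOIN Snapshots p ON p.ID = c.ID AND p._Index + 1 = c._Index
)
, fieldsToRows AS
(
    SELECT DateTaken, ID, _Index, value, field
    FROM data d
    UNPIVOT (value FOR field IN (Field1, Field1_Previous,
                                 Field2, Field2_Previous,
                                 Field3, Field3_Previous)) AS unpvt   -- new: two names
)
SELECT curr.DateTaken, curr.ID, curr.field,
       prev.value AS previous, curr.value AS [current]
FROM fieldsToRows curr
JOIN fieldsToRows prev
    ON  curr.ID = prev.ID
    AND curr._Index = prev._Index
    AND curr.field + '_Previous' = prev.field
WHERE curr.value <> prev.value          -- same predicate as before
ORDER BY curr.DateTaken DESC, curr.ID, curr.field;
```

With this sample data the query should report three changes: Field1 for the 2011-02-01 snapshot, and Field2 and Field3 for the 2011-03-01 snapshot. One caveat applies to the original query as well: UNPIVOT omits rows whose value is NULL, so a change to or from NULL would not be reported without extra handling.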
