How do I exclude rows when an incremental value starts over?
Problem description
I'm new to posting here, but I have spent a lot of time researching answers. I can't quite figure out how to build a result set in SQL Server 2008 R2 for a problem that would probably be solved with LEAD/LAG in more recent versions. I am trying to aggregate data based on the sequencing of one column, but each sequence can contain a varying number of rows. The only way I know a sequence has ended is that the next row has a lower sequence number. So the values may go 1-2, 1-2-3-4, 1-2-3, and I have to figure out how to make three aggregates out of that.
The source data comes from joined tables and looks like this (please help me format):
recordID  instanceDate    moduleID  iResult  interactionNum
1356      10/6/15 16:14   1         68       1
1357      10/7/15 16:22   1         100      2
1434      10/9/15 16:58   1         52       1
1435      10/11/15 17:00  1         60       2
1436      10/15/15 16:57  1         100      3
1437      10/15/15 16:59  1         100      4
I need to find a way to separate the first 2 rows from the last 4 rows in this example, based on the values in the last column.
What I would ultimately love to get is a result set like this, which averages the iResult column within each group and takes the first instanceDate of the group:
instanceDate  moduleID  iResult
10/6/15       1         84
10/9/15       1         78
I can aggregate to get this result using MIN and AVG if I can just find a way to separate the groups. The data is ordered by instanceDate (please ignore the date formatting here) and then by interactionNum. A new group should start whenever the query finds a row whose interactionNum is less than or equal to that of the previous row. The value will usually start over at 1, but not always, so I would prefer to split on any lower or equal integer value.
Here is the query I have so far (it includes the joins that produce the data set above):
SELECT
    X.*
FROM
    (SELECT TOP 100 PERCENT
         instanceDate, b.ModuleID, iResult, b.interactionNum
     FROM
         (firstTable a
          INNER JOIN
          secondTable b ON b.someID = a.someID)
     WHERE
         a.someID = 2
         AND b.otherID LIKE 'xyz'
         AND a.ModuleID = 1
     ORDER BY
         instanceDate) AS X
OUTER APPLY
    (SELECT TOP 1
         *
     FROM
         (SELECT
              instanceDate, d.ModuleID, iResult, d.interactionNum
          FROM
              (firstTable c
               INNER JOIN
               secondTable d ON d.someID = c.someID)
          WHERE
              c.someID = 2
              AND d.otherID LIKE 'xyz'
              AND c.ModuleID = 1
              AND d.interactionNum = X.interactionNum
              AND c.instanceDate < X.instanceDate) X2
     ORDER BY
         instanceDate DESC) Y
WHERE
    NOT EXISTS (SELECT Y.interactionNum INTERSECT SELECT X.interactionNum)
But this returns an interim result set like this:
instanceDate    ModuleID  iResult  interactionNum
10/6/15 16:10   1         68       1
10/6/15 16:14   1         100      2
10/15/15 16:57  1         100      3
10/15/15 16:59  1         100      4
The problem is that the rows with interactionNum 3 and 4 do not belong in this result set. They would go in the next result set when I loop over this query. How do I keep them out of the result set in this iteration? I need the result set from this query to include only the first two rows, 'seeing' that row 3 of the source data has a lower interactionNum value than row 2.
Recommended answer
I'm not sure which ModuleID was supposed to be used, but I guess you're looking for something like this:
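-- Number the rows per module in date order; within one run RN and
-- interactionNum rise together, so RN - interactionNum is constant per run.
select min(instanceDate) as instanceDate, [moduleID], avg([iResult]) as iResult
from (
    select *, row_number() over (partition by [moduleID] order by instanceDate) as RN
    from Table1
) X
group by [moduleID], RN - [interactionNum]
order by min(instanceDate)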
The idea here is to create a running number with row_number for each moduleID, and then use the difference between that number and interactionNum as the grouping criterion. Within one run both values rise by 1 per row, so the difference stays constant; when interactionNum starts over, the difference jumps and a new group begins.
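To make the arithmetic concrete, here is a minimal Python sketch that mirrors the query on the sample rows from the question. It is only an illustration, not part of the original answer: the function name `aggregate_runs` is hypothetical, the dates are rewritten as ISO strings so they sort as text, and the average is a float rather than the integer average SQL Server would return for an int column.

```python
# Sample rows from the question, already ordered by instanceDate:
# (instanceDate, moduleID, iResult, interactionNum).
# Assumption: dates are rewritten in ISO form so plain string comparison works.
rows = [
    ("2015-10-06 16:14", 1, 68, 1),
    ("2015-10-07 16:22", 1, 100, 2),
    ("2015-10-09 16:58", 1, 52, 1),
    ("2015-10-11 17:00", 1, 60, 2),
    ("2015-10-15 16:57", 1, 100, 3),
    ("2015-10-15 16:59", 1, 100, 4),
]


def aggregate_runs(rows):
    """Group rows by (moduleID, row number - interactionNum) and aggregate.

    Hypothetical helper that imitates
    row_number() over (partition by moduleID order by instanceDate)
    followed by group by moduleID, RN - interactionNum.
    """
    counters = {}  # running row number per moduleID
    groups = {}    # (moduleID, RN - interactionNum) -> list of (date, iResult)
    for date, module_id, result, interaction_num in rows:
        counters[module_id] = counters.get(module_id, 0) + 1
        rn = counters[module_id]
        key = (module_id, rn - interaction_num)
        groups.setdefault(key, []).append((date, result))

    output = []
    for (module_id, _diff), members in groups.items():
        first_date = min(date for date, _ in members)               # MIN(instanceDate)
        average = sum(res for _, res in members) / len(members)     # AVG(iResult)
        output.append((first_date, module_id, average))
    return sorted(output)


for line in aggregate_runs(rows):
    print(line)
```

For the six sample rows the row numbers are 1 to 6 and the interactionNum values are 1, 2, 1, 2, 3, 4, so the differences are 0, 0, 2, 2, 2, 2, which yields the two groups asked for. Note that this relies on interactionNum rising by exactly 1 within a run; if a run can contain gaps, comparing each row with the previous one would be needed instead.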