SQL - 如果满足使用多个先前列的条件,则 LAG 以获取先前值 [英] SQL - LAG to get previous value if condition using multiple previous columns satisfied

查看:40
本文介绍了SQL - 如果满足使用多个先前列的条件,则 LAG 以获取先前值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个由以下人员创建的表:

I have a table created by:

CREATE TABLE #test_table 
(
id INT
,EventName VARCHAR(50)
,HomeTeam VARCHAR(25)
,Metric INT
)

INSERT INTO #test_table VALUES
(1, 'Team A vs Team B', 'Team A', 5),
(2, 'Team A vs Team B', 'Team A', 7),
(3, 'Team C vs Team D', 'Team C', 6),
(4, 'Team Z vs Team A', 'Team Z', 8),
(5, 'Team A vs Team B', 'Team A', 9),
(6, 'Team C vs Team D', 'Team C', 3),
(7, 'Team C vs Team D', 'Team C', 1),
(8, 'Team E vs Team F', 'Team E', 2)

结果:

id  EventName           HomeTeam    Metric
------------------------------------------
1   Team A vs Team B    Team A      5
2   Team A vs Team B    Team A      7
3   Team C vs Team D    Team C      6
4   Team Z vs Team A    Team Z      8
5   Team A vs Team B    Team A      9
6   Team C vs Team D    Team C      3
7   Team C vs Team D    Team C      1
8   Team E vs Team F    Team E      2

A 想要计算一个新列 PreviousMetricN,其中 N 可以是 1, 2, 3, ... 这显示了 Metric 的前一个值,但前提是HomeTeam 参与了之前的活动.例如:

A want to calculate a new column PreviousMetricN where N can be 1, 2, 3, ... which shows the previous value for Metric, but only if the HomeTeam was involved in the previous event. For example:

id  EventName           HomeTeam    Metric  PreviousMetric1 PreviousMetric2
------------------------------------------------------------------------
1   Team A vs Team B    Team A      5       NULL            NULL
2   Team A vs Team B    Team A      7       5               NULL
3   Team C vs Team D    Team C      6       NULL            NULL
4   Team Z vs Team A    Team Z      8       NULL            NULL
5   Team A vs Team B    Team A      9       8               7
6   Team C vs Team D    Team C      3       6               NULL
7   Team C vs Team D    Team C      1       3               6
8   Team E vs Team F    Team E      2       NULL            NULL

我一直在尝试使用 PARTITION BY 子句中的新分组变量的 LAG 变体,例如

I have been trying variations of LAG with a new grouping variable in the PARTITION BY clause such as

LAG(Metric) OVER(Partition by (CASE WHEN CHARINDEX(HomeTeam, EventName)>0 THEN 1 ELSE 0 END) ORDER BY id)

但没有任何成功.这怎么办?

but without any success. How can this be done?

我也在这里为熊猫问过这个问题:Pandas shift - 如果满足多个条件,则获取之前的值

I've also asked this question for Pandas here: Pandas shift - get previous value if multiple conditions satisfied

推荐答案

我在这里看不到使用窗口函数和单次扫描表格的答案.我们可以在单次扫描中执行此查询,如下所示:

I see no answer here that uses window functions and a single scan of the table. We can do this query in a single scan as follows:

让我们假设您在另一列中有 AwayTeam.

Let us assume you have the AwayTeam in another column.

如果你还没有这个并且你想从 EventData 中解析它:
我们可以使用:SUBSTRING(EventData, CHARINDEX(' vs ', EventData) + 4)
我敦促您遵循适当的规范化并将其创建为表格中的适当列.

If you don't have this yet and you wanted to parse it out of EventData:
We could use: SUBSTRING(EventData, CHARINDEX(' vs ', EventData) + 4)
I urge you to follow proper normalization and create this as a proper column in your table.

我们的算法是这样运行的:

Our algorithm runs like this:

  1. 使用CROSS APPLY
  2. 将两个团队相乘(逆透视)作为单独的行
  3. 使用LAG计算之前的Metric,按合并后的Team列进行分区
  4. 过滤掉翻倍的行,这样我们的原始行只有一行
  1. Multiply out (unpivot) the two teams as separate rows, using CROSS APPLY
  2. Calculate the previous Metrics using LAG, partitioning by the merged Team column
  3. Filter back down the doubled up rows, so that we only get a single row for each of our original ones

SELECT id, HomeTeam, AwayTeam, Metric, Prev1, Prev2, Prev3
FROM (

  SELECT *
    ,Prev1 = LAG(Metric, 1) OVER (PARTITION BY v.Team ORDER BY id)
    ,Prev2 = LAG(Metric, 2) OVER (PARTITION BY v.Team ORDER BY id)
    ,Prev3 = LAG(Metric, 3) OVER (PARTITION BY v.Team ORDER BY id)
    -- more of these ......
  FROM test_table
  CROSS APPLY (VALUES (HomeTeam, 1),(AwayTeam, 0)) AS v(Team,IsHome)
) AS t

WHERE IsHome = 1
-- ORDER BY id  --if necessary

重要的是,我们无需使用多种不同的排序、分区或排序,也无需使用自联接即可完成此操作.只需一次扫描.

Importantly, we can do this without the use of multiple different sorts, partitions or ordering, and without the use of a self-join. Just a single scan.

结果:

<头>
id家庭团队客队公制上一页上一个上一个
1A队B 队5(空)(空)(空)
2A队B 队75(空)(空)
3C 组团队 D6(空)(空)(空)
4Z 团队A队8(空)(空)(空)
5A队B 队9875
6C 组团队 D36(空)(空)
7C 组团队 D136(空)
8团队 EF 团队2(空)(空)(空)

这篇关于SQL - 如果满足使用多个先前列的条件,则 LAG 以获取先前值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆