ROW_NUMBER()OVER(PARTITION BY ...)使用麻烦 [英] Trouble using ROW_NUMBER() OVER (PARTITION BY ...)

查看:184
本文介绍了ROW_NUMBER()OVER(PARTITION BY ...)使用麻烦的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用SQL Server 2008 R2.我有一个名为EmployeeHistory的表,具有以下结构和示例数据:

I'm using SQL Server 2008 R2. I have table called EmployeeHistory with the following structure and sample data:

EmployeeID Date      DepartmentID SupervisorID
10001      20130101  001          10009
10001      20130909  001          10019
10001      20131201  002          10018
10001      20140501  002          10017
10001      20141001  001          10015
10001      20141201  001          10014

请注意,随着时间的推移,员工10001一直在更换2个部门和几个主管.我要做的是在日期"字段中按顺序列出每个部门中该雇员的工作的开始和结束日期.因此,输出将如下所示:

Notice that the Employee 10001 has been changing 2 departments and several supervisors over time. What I am trying to do is to list the start and end dates of this employee's employment in each Department ordered by the Date field. So, the output will look like this:

EmployeeID DateStart DateEnd  DepartmentID 
10001      20130101  20131201 001
10001      20131201  20141001 002
10001      20141001  NULL     001

我打算使用以下查询对数据进行分区,但失败了.部门从001更改为002,然后又更改为001.显然,我无法按DepartmentID进行分区...我敢肯定,我忽略了明显的部分.有什么帮助吗?谢谢你.

I intended to use partitioning the data using the following query but it failed. The Department changes from 001 to 002 and then back to 001. Obviously I cannot partition by DepartmentID... I'm sure I'm overlooking the obvious. Any help? Thank you, in advance.

SELECT * ,ROW_NUMBER() OVER (PARTITION BY EmployeeID, DepartmentID
ORDER BY [Date]) RN FROM EmployeeHistory

推荐答案

涉及到一点点.最简单的方法是参考我为您创建的此SQL提琴,它会产生确切的结果.您可以通过多种方法出于性能或其他方面的考虑对其进行改进,但是希望至少应该比某些替代方案更清楚.

A bit involved. Easiest would be to refer to this SQL Fiddle I created for you that produces the exact result. There are ways you can improve it for performance or other considerations, but this should hopefully at least be clearer than some alternatives.

要点是,您首先要对数据进行规范的排名,然后使用该排名将数据分为几组,然后为每个组找到结束日期,然后消除所有中间行. ROW_NUMBER()和CROSS APPLY在可读性方面有很大帮助.

The gist is, you get a canonical ranking of your data first, then use that to segment the data into groups, then find an end date for each group, then eliminate any intermediate rows. ROW_NUMBER() and CROSS APPLY help a lot in doing it readably.

EDIT 2019:

EDIT 2019:

实际上,由于某种原因,SQL Fiddle确实被破坏了,但是在SQL Fiddle站点上似乎是一个问题.这是一个完整的版本,刚刚在SQL Server 2016上进行了测试:

The SQL Fiddle does in fact seem to be broken, for some reason, but it appears to be a problem on the SQL Fiddle site. Here's a complete version, tested just now on SQL Server 2016:

CREATE TABLE Source
(
  EmployeeID int,
  DateStarted date,
  DepartmentID int
)

INSERT INTO Source
VALUES
(10001,'2013-01-01',001),
(10001,'2013-09-09',001),
(10001,'2013-12-01',002),
(10001,'2014-05-01',002),
(10001,'2014-10-01',001),
(10001,'2014-12-01',001)


SELECT *, 
  ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY DateStarted) AS EntryRank,
  newid() as GroupKey,
  CAST(NULL AS date) AS EndDate
INTO #RankedData
FROM Source
;

UPDATE #RankedData
SET GroupKey = beginDate.GroupKey
FROM #RankedData sup
  CROSS APPLY 
  (
    SELECT TOP 1 GroupKey
    FROM #RankedData sub 
    WHERE sub.EmployeeID = sup.EmployeeID AND
      sub.DepartmentID = sup.DepartmentID AND
      NOT EXISTS 
        (
          SELECT * 
          FROM #RankedData bot 
          WHERE bot.EmployeeID = sup.EmployeeID AND
            bot.EntryRank BETWEEN sub.EntryRank AND sup.EntryRank AND
            bot.DepartmentID <> sup.DepartmentID
        )
      ORDER BY DateStarted ASC
    ) beginDate (GroupKey);

UPDATE #RankedData
SET EndDate = nextGroup.DateStarted
FROM #RankedData sup
  CROSS APPLY 
  (
    SELECT TOP 1 DateStarted
    FROM #RankedData sub
    WHERE sub.EmployeeID = sup.EmployeeID AND
      sub.DepartmentID <> sup.DepartmentID AND
      sub.EntryRank > sup.EntryRank
    ORDER BY EntryRank ASC
  ) nextGroup (DateStarted);

SELECT * FROM 
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY GroupKey ORDER BY EntryRank ASC) AS GroupRank FROM #RankedData
) FinalRanking
WHERE GroupRank = 1
ORDER BY EntryRank;

DROP TABLE #RankedData
DROP TABLE Source

这篇关于ROW_NUMBER()OVER(PARTITION BY ...)使用麻烦的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆