Oracle 'Partition By' and 'Row_Number' keyword


Question

I have a SQL query written by someone else and I'm trying to figure out what it does. Can someone please explain what the PARTITION BY and ROW_NUMBER keywords do here, give a simple example of them in action, and explain why one would want to use them?

An example of PARTITION BY:

(SELECT cdt.*,
        ROW_NUMBER ()
        OVER (PARTITION BY cdt.country_code, cdt.account, cdt.currency
              ORDER BY cdt.country_code, cdt.account, cdt.currency)
           seq_no
   FROM CUSTOMER_DETAILS cdt);

I've seen some examples online, but they go into a bit too much depth.

Thanks in advance!

Recommended answer

PARTITION BY segregates rows into sets, which lets you apply functions such as ROW_NUMBER(), COUNT(), and SUM() to each related set independently.
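To make "working on each set independently" concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module only so that it can be run anywhere; SQLite (3.25 or later) is a stand-in for Oracle, and the table, its `amount` column, and the data are hypothetical.

```python
import sqlite3

# SQLite stands in for Oracle here; table, columns and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table tbl(hi text, amount integer)")
conn.executemany(
    "insert into tbl values (?, ?)",
    [("A", 10), ("A", 20), ("A", 30), ("B", 5), ("B", 7)],
)

# Each window function is evaluated separately inside every 'hi' partition.
rows = conn.execute("""
    select hi, amount,
           row_number() over (partition by hi order by amount) as nr,
           count(*)     over (partition by hi)                 as cnt,
           sum(amount)  over (partition by hi)                 as total
    from tbl
    order by hi, amount
""").fetchall()

for row in rows:
    print(row)
```

The numbering restarts at 1 for each value of `hi`, while `cnt` and `total` repeat the per-partition count and sum on every row of that partition.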

In your query, each related set consists of the rows that share the same cdt.country_code, cdt.account, and cdt.currency values. When you partition on those columns and apply ROW_NUMBER, the rows within each such combination receive sequential numbers from ROW_NUMBER.

But that query is odd: if you partition by something unique and put a ROW_NUMBER on it, every row just gets the same number. It's like doing an ORDER BY within a partition that is guaranteed to contain a single row. For example, think of a GUID as standing in for a unique combination of cdt.country_code, cdt.account, cdt.currency.

newid() produces a GUID, so what would you expect from this expression?

select
   hi,ho,
   row_number() over(partition by newid() order by hi,ho)
from tbl;

...Right: nothing is really grouped, because every row ends up in a partition of its own, so every row's ROW_NUMBER is 1.

Basically, you should partition on non-unique columns. The ORDER BY in the OVER clause is only meaningful when the PARTITION BY combination is non-unique; otherwise every ROW_NUMBER will be 1.

For example, suppose this is your data:

create table tbl(hi varchar, ho varchar);

insert into tbl values
('A','X'),
('A','Y'),
('A','Z'),
('B','W'),
('B','W'),
('C','L'),
('C','L');

Then this is analogous to your query:

select
   hi,ho,
   row_number() over(partition by hi,ho order by hi,ho)
from tbl;

What will the output be?

HI  HO  COLUMN_2
A   X   1
A   Y   1
A   Z   1
B   W   1
B   W   2
C   L   1
C   L   2

See the combinations of HI and HO? The first three rows each have a unique combination, so they are all numbered 1. The two B rows share the same W, so they get different ROW_NUMBERs; likewise the two C rows.

Now, why is the ORDER BY needed there? If the previous developer merely wanted to put a ROW_NUMBER on similar data (e.g. HI B, where all the rows are B-W, B-W), he could just do this:

select
   hi,ho,
   row_number() over(partition by hi,ho)
from tbl;

But alas, Oracle (and SQL Server too) requires an ORDER BY in the OVER clause for ROW_NUMBER, so this query is rejected there; in PostgreSQL, by contrast, the ORDER BY is optional: http://www.sqlfiddle.com/#!1/27821/1

select
   hi,ho,
   row_number() over(partition by hi,ho)
from tbl;

The ORDER BY in your window looks a bit redundant, but that is not necessarily the previous developer's fault: some databases simply don't allow a PARTITION BY with no ORDER BY, and he might not have been able to find a good candidate column to sort on. If the PARTITION BY columns and the ORDER BY columns are the same, you would just remove the ORDER BY; since some databases don't allow that, you can do this instead:

SELECT cdt.*,
        ROW_NUMBER ()
        OVER (PARTITION BY cdt.country_code, cdt.account, cdt.currency
              ORDER BY newid())
           seq_no
   FROM CUSTOMER_DETAILS cdt

Can't find a good column for sorting similar data? You might as well sort randomly, since the rows in each partition have the same values anyway. You can sort on a GUID, for example (newid() in SQL Server; Oracle's counterpart is SYS_GUID()). This produces the same output as the previous developer's query. It's unfortunate that some databases don't allow PARTITION BY without ORDER BY.

Still, I really cannot find a good reason to put a number on identical combinations (B-W, B-W in the example above). It gives the impression of a database holding redundant data. Somehow it reminded me of this: How to get one unique record from the same list of records from table? No Unique constraint in the table
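If the table really does hold redundant rows, as in the question referenced above, one conceivable use of numbering identical combinations is to keep only the first row of each. The following is only a sketch of that idea under that assumption, not a claim about what the original query was for; it reuses the first example's data and runs on SQLite (3.25 or later) via Python's sqlite3 module as a stand-in for Oracle.

```python
import sqlite3

# SQLite stands in for Oracle; data is the first example's tbl.
conn = sqlite3.connect(":memory:")
conn.execute("create table tbl(hi text, ho text)")
conn.executemany(
    "insert into tbl values (?, ?)",
    [("A", "X"), ("A", "Y"), ("A", "Z"),
     ("B", "W"), ("B", "W"), ("C", "L"), ("C", "L")],
)

# Number the rows inside each (hi, ho) combination, then keep only number 1.
unique_rows = conn.execute("""
    select hi, ho
    from (
        select hi, ho,
               row_number() over (partition by hi, ho order by hi, ho) as nr
        from tbl
    ) as numbered
    where nr = 1
    order by hi, ho
""").fetchall()

print(unique_rows)
```

Here the duplicate B-W and C-L rows collapse to one row each, which a plain SELECT DISTINCT would also achieve for this simple case.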

A PARTITION BY with the same combination of columns as the ORDER BY looks really arcane; you cannot easily infer the code's intent.

Live test: http://www.sqlfiddle.com/#!3/27821/6

But as dbaseman has also noticed, it's useless to partition and order on the same columns.

Suppose you have a set of data like this:

create table tbl(hi varchar, ho varchar);

insert into tbl values
('A','X'),
('A','X'),
('A','X'),
('B','Y'),
('B','Y'),
('C','Z'),
('C','Z');

Then you PARTITION BY hi,ho and then you ORDER BY hi,ho. There's no sense in numbering identical data :-) http://www.sqlfiddle.com/#!3/29ab8/3

select
   hi,ho,
   row_number() over(partition by hi,ho order by hi,ho) as nr
from tbl;

Output:

HI  HO  NR
A   X   1
A   X   2
A   X   3
B   Y   1
B   Y   2
C   Z   1
C   Z   2

See? Why would you need to put row numbers on identical combinations? What would you analyze on the triple A,X, the double B,Y, or the double C,Z? :-)

You just need to PARTITION BY a non-unique column and then ORDER BY a column that distinguishes the rows within each partition. An example will make it clearer:

create table tbl(hi varchar, ho varchar);

insert into tbl values
('A','D'),
('A','E'),
('A','F'),
('B','F'),
('B','E'),
('C','E'),
('C','D');

select
   hi,ho,
   row_number() over(partition by hi order by ho) as nr
from tbl;

PARTITION BY hi operates on a non-unique column; then, within each partition, you order by the column that is unique there (ho): ORDER BY ho.

Output:

HI  HO  NR
A   D   1
A   E   2
A   F   3
B   E   1
B   F   2
C   D   1
C   E   2

This result set makes more sense.

Live test: http://www.sqlfiddle.com/#!3/d0b44/1

And this is similar to your query, with the same columns in both PARTITION BY and ORDER BY:

select
   hi,ho,
   row_number() over(partition by hi,ho order by hi,ho) as nr
from tbl;

This is the output:

HI  HO  NR
A   D   1
A   E   1
A   F   1
B   E   1
B   F   1
C   D   1
C   E   1

See? It doesn't make sense, does it?

Live test: http://www.sqlfiddle.com/#!3/d0b44/3

Finally, this might be the right query:

SELECT cdt.*,
     ROW_NUMBER ()
     OVER (PARTITION BY cdt.country_code, cdt.account -- removed: cdt.currency
           ORDER BY 
               -- removed: cdt.country_code, cdt.account, 
               cdt.currency) -- keep
        seq_no
FROM CUSTOMER_DETAILS cdt
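To see what that final query would produce, here is a small sketch run against made-up data. The table and column names come from the question, but the rows are hypothetical, and SQLite (3.25 or later, via Python's sqlite3 module) is used as a stand-in for Oracle.

```python
import sqlite3

# SQLite stands in for Oracle; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table customer_details(country_code text, account text, currency text)"
)
conn.executemany(
    "insert into customer_details values (?, ?, ?)",
    [("US", "ACC1", "USD"), ("US", "ACC1", "EUR"), ("US", "ACC2", "USD"),
     ("GB", "ACC9", "GBP"), ("GB", "ACC9", "EUR")],
)

# Partition by country and account, number the currencies inside each pair.
rows = conn.execute("""
    select cdt.*,
           row_number() over (partition by cdt.country_code, cdt.account
                              order by cdt.currency) as seq_no
    from customer_details cdt
    order by cdt.country_code, cdt.account, cdt.currency
""").fetchall()

for row in rows:
    print(row)
```

With this data, seq_no counts the currencies held under each country and account pair, restarting at 1 whenever the pair changes.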
