窗口函数的确定性排序顺序 [英] Deterministic sort order for window functions

查看:171
本文介绍了窗口函数的确定性排序顺序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个状态表,我想获取最新的详细信息。

I've a status table and I want to fetch the latest details.

Slno |   ID | Status | date
  1  |   1  | Pass   | 15-06-2015 11:11:00 - this is inserted first
  2  |   1  | Fail   | 15-06-2015 11:11:00 - this is inserted second 
  3  |   2  | Fail   | 15-06-2015 12:11:11 - this is inserted first
  4  |   2  | Pass   | 15-06-2015 12:11:11 - this is inserted second

我使用带有按ID顺序按日期desc 进行分区以获取第一个值。

例外的输出:

I use a window function with partition by ID order by date desc to fetch the first value.
Excepted Output :

2  |   1  | Fail   | 15-06-2015 11:11:00 - this is inserted second
4  |   2  | Pass   | 15-06-2015 12:11:11 - this is inserted second

实际输出:

1  |   1  | Pass   | 15-06-2015 11:11:00 - this is inserted first
3  |   2  | Fail   | 15-06-2015 12:11:11 - this is inserted first

根据[ http://docs.aws.amazon.com/redshift/latest/dg/r_Examples_order_by_WF .html] ,向window函数添加第二个 ORDER BY 列可以解决该问题。但是我没有任何其他列来区分行!

还有另一种方法可以解决此问题吗?

According to [http://docs.aws.amazon.com/redshift/latest/dg/r_Examples_order_by_WF.html], adding a second ORDER BY column to the window function may solve the problem. But I don't have any other column to differentiate the rows!
Is there another approach to solve the issue?

编辑:我已经为了清楚起见,在此处添加了 slno 。我在表中没有 slno

I've added slno here for clarity. I don't have slno as such in the table!

我的SQL:

with range as (
  select id from status where date between 01-06-2015 and 30-06-2015
), latest as ( 
  select status, id, row_number() OVER (PARTITION BY id ORDER BY date DESC) row_num
)
select * from latest where row_num = 1


推荐答案

如果您没有 slno 在表中,那么您就没有可靠的信息,而是首先插入哪一行。表中没有自然顺序,行的物理顺序可以随时更改(进行任何更新,或使用 VACUUM 等)

If you don't have slno in your table, then you don't have any reliable information which row was inserted first. There is no natural order in a table, the physical order of rows can change any time (with any update, or with VACUUM, etc.)

可以使用 不可靠 的技巧:按内部 ctid

select *
from  (
   select id, status
        , row_number() OVER (PARTITION BY id
                             ORDER BY date, ctid) AS row_num
   from   status  -- that's your table name??
   where  date >= '2015-06-01'  -- assuming column is actually a date
   and    date <  '2015-07-01'
  ) sub
where  row_num = 1;




  • 在没有其他信息的情况下,哪一行排在最前面(这是设计错误,先对其进行修复!),您可以尝试使用内部元组ID ctid

    • In absence of any other information which row came first (which is a design error to begin with, fix it!), you might try to save what you can using the internal tuple ID ctid

      • In-order sequence generation

      行在最初插入时将保持物理顺序,但是通过对表的任何写操作或 VACUUM 或其他事件,行可以随时更改。

      这是最后的手段,它会中断。

      Rows will be in physical order when inserted initially, but that can change any time with any write operation to the table or VACUUM or other events.
      This is a measure of last resort and it will break.

      您的出现的查询在以下几个方面无效:第一个CTE中缺少列名,第二个CTE中缺少表名,...

      Your presented query was invalid on several counts: missing column name in 1st CTE, missing table name in 2nd CTE, ...

      您不需要CTE

      更简单,将 DISTINCT ON (注意事项 ctid 应用相同):

      Simpler with DISTINCT ON (considerations for ctid apply the same):

      SELECT DISTINCT ON (id)
             id, status
      FROM   status
      WHERE  date >= '2015-06-01'
      AND    date <  '2015-07-01'
      ORDER  BY id, date, ctid;
      




      • 选择每个GROUP BY组中的第一行?

        • Select first row in each GROUP BY group?
        • 这篇关于窗口函数的确定性排序顺序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆