Creating a rank that resets on a specific value of a column

Views: 19 | Published: 2021/12/28 23:49:12 | Tags: sql, hive, hiveql


Problem description

My current data looks like this (note that it is sorted by datetime):

+----------------+---------------------+---------+
| CustomerNumber |        Date         | Channel |
+----------------+---------------------+---------+
|      120584446 | 2015-05-22 21:16:05 |    A    |
|      120584446 | 2015-05-25 18:04:16 |    A    |
|      120584446 | 2015-05-25 18:05:25 |    B    |
|      120584446 | 2015-05-28 20:35:09 |    A    |
|      120584446 | 2015-05-28 20:36:01 |    A    |
|      120584446 | 2015-05-28 20:37:02 |    B    |
|      120584446 | 2015-05-29 13:39:00 |    B    |
+----------------+---------------------+---------+

I want to create a rank in Hive that is partitioned by customer number and restarts at 1 whenever the channel is A. It should look like this:

+----------------+---------------------+----------------+------+
| CustomerNumber |          Date       |        Channel | Rank |
+----------------+---------------------+----------------+------+
|      120584446 | 2015-05-22 21:16:05 |    A           |    1 |
|      120584446 | 2015-05-25 18:04:16 |    A           |    1 |
|      120584446 | 2015-05-25 18:05:25 |    B           |    2 |
|      120584446 | 2015-05-28 20:35:09 |    A           |    1 |
|      120584446 | 2015-05-28 20:36:01 |    A           |    1 |
|      120584446 | 2015-05-28 20:37:02 |    B           |    2 |
|      120584446 | 2015-05-29 13:39:00 |    B           |    3 |
+----------------+---------------------+----------------+------+

Recommended answer

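One approach is to use a cumulative conditional sum to identify the groups and then use row_number() for the ranking. The running sum counts the rows with channel A seen so far for each customer, so it increases by 1 at every A row and each A row starts a new group: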

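-- Outer query: number the rows inside each (CustomerNumber, grp) group.
-- `date` is backtick-quoted because DATE is a reserved keyword in newer Hive versions.
select t.*,
       row_number() over (partition by CustomerNumber, grp
                          order by `date`
                         ) as rank
from (-- Inner query: running count of 'A' rows per customer, used as the group id.
      select t.*,
             sum(case when channel = 'A' then 1 else 0 end) over
                 (partition by CustomerNumber order by `date`) as grp
      from t
     ) t;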
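
To see why this produces the requested output, the two steps can be traced by hand on the sample data. The following is a minimal Python sketch of the same logic, not part of the original answer: the function name `rank_with_reset` and the dictionary keys are made up for illustration, and it assumes that timestamps are distinct within a customer (with tied timestamps the SQL window's default frame would treat the tied rows as peers).

```python
from itertools import groupby

# Sample rows from the question: (CustomerNumber, Date, Channel).
rows = [
    (120584446, "2015-05-22 21:16:05", "A"),
    (120584446, "2015-05-25 18:04:16", "A"),
    (120584446, "2015-05-25 18:05:25", "B"),
    (120584446, "2015-05-28 20:35:09", "A"),
    (120584446, "2015-05-28 20:36:01", "A"),
    (120584446, "2015-05-28 20:37:02", "B"),
    (120584446, "2015-05-29 13:39:00", "B"),
]


def rank_with_reset(rows, reset_value="A"):
    """Hypothetical helper mirroring the query: grp is the running count of
    reset_value rows per customer, rank is the row number within each grp."""
    # Order by customer, then timestamp (these fixed-width strings sort chronologically).
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    result = []
    for customer, customer_rows in groupby(ordered, key=lambda r: r[0]):
        grp = 0       # cumulative conditional sum (inner query)
        position = 0  # row_number() within the current group (outer query)
        for _, date, channel in customer_rows:
            if channel == reset_value:
                grp += 1      # every reset_value row opens a new group
                position = 0  # so the numbering starts over
            position += 1
            result.append({"customer": customer, "date": date,
                           "channel": channel, "grp": grp, "rank": position})
    return result


ranked = rank_with_reset(rows)
for r in ranked:
    print(r["date"], r["channel"], r["grp"], r["rank"])
```

For the sample data, grp comes out as 1, 2, 2, 3, 4, 4, 4 and rank as 1, 1, 2, 1, 1, 2, 3, which matches the desired table in the question. Rows that appear before a customer's first A row fall into group 0 and are numbered from 1, just as they would be in the query.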
