少数列的Hive / SQL捆绑列,其余列是基于拉的最低/最高的其他列 [英] Hive/SQL bundling columns for few columns,rest of the columns are pull based lowest/highest of other columns

查看:142
本文介绍了少数列的Hive / SQL捆绑列,其余列是基于拉的最低/最高的其他列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个配置单元表,如下面的5列

  name orderno productcategory数量描述
KJFSFKS 1 1 40 D1
KJFSFKS 2 2 50 D2
KJFSFKS 3 2 67 D3
KJFSFKS 4 2 10 D4
KJFSFKS 5 3 2 D5
KJFSFKS 6 3 5 D6
KJFSFKS 7 3 6 D7
KJFSFKS 8 4 8 D8
KJFSFKS 9 5 8 D9
KJFSFKS 10 5 10 D10

基于相同产品类别代码的所需输出,如果产品类别代码在多行添加数量字段中相同,请根据最高orderno选择描述,orderno始终为picklowest,输出如下

 名字orderno产品类别金额说明
KJFSFKS 1 1 40 D1
KJFSFKS 2 2 127 D4
KJFSFKS 5 3 13 D7
KJFSFKS 8 4 8 D8
KJFSFKS 9 5 18 D10

如上所述,一些字段是按照某种顺序排列的,其他按不同顺序排列。

我用g roup by but sum(amount)很好,关于description字段,它是基于orderno列的,在我的需求中还有其他列,我应该根据订单号挑选。

解决方案

  select name,orderno,productcategory,amount,description 
from

select name,orderno (产品类别),
总额(金额)(按名称划分,productcategory)金额,
first_value(说明)over(按名称划分,productcategory order by orderno desc)说明,
row_number() (按名称划分,order by orderno的产品类别订单)$ your

)s其中rn = 1; - 检查最低订单号






  OK 
KJFSFKS 1 1 40 D1
KJFSFKS 2 2 127 D4
KJFSFKS 5 3 13 D7
KJFSFKS 8 4 8 D8
KJFSFKS 9 5 18 D10
所用时间:12.492秒,已提取:5行


i have a hive table as below with 5 columns

name orderno productcategory amount description
KJFSFKS 1   1   40  D1
KJFSFKS 2   2   50  D2
KJFSFKS 3   2   67  D3
KJFSFKS 4   2   10  D4
KJFSFKS 5   3   2   D5
KJFSFKS 6   3   5   D6
KJFSFKS 7   3   6   D7
KJFSFKS 8   4   8   D8
KJFSFKS 9   5   8   D9
KJFSFKS 10  5   10  D10

desired output based on same product category code, if productcategory code is same across multiple rows add amount field, pick the description based on highest orderno, orderno always picklowest, output as below

name    orderno productcategory amount  description
KJFSFKS 1   1   40  D1
KJFSFKS 2   2   127 D4
KJFSFKS 5   3   13  D7
KJFSFKS 8   4   8   D8
KJFSFKS 9   5   18  D10

As said above,some fields are in some order, other in different order

i used group by but sum(amount) is fine, what about description field, it is based on orderno column, also there are other columns in my requirement where i should pick based on order number

解决方案

select name, orderno,  productcategory,  amount,   description 
from 
(
select name, orderno, productcategory, 
       sum(amount) over(partition by name, productcategory) amount, 
       first_value(description) over(partition by name, productcategory order by orderno desc) description,
       row_number() over (partition by name, productcategory order by orderno) rn
from  your_table
)s where rn=1; --pick lowest orderno 


OK
KJFSFKS 1       1       40      D1
KJFSFKS 2       2       127     D4
KJFSFKS 5       3       13      D7
KJFSFKS 8       4       8       D8
KJFSFKS 9       5       18      D10
Time taken: 12.492 seconds, Fetched: 5 row(s)

这篇关于少数列的Hive / SQL捆绑列,其余列是基于拉的最低/最高的其他列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆