Passing multiple dates as parameters to a Hive query

Question

I am trying to pass a list of dates as a parameter to my Hive query.

#!/bin/bash
echo "Executing the hive query - Get distinct dates"
var=`hive -S -e "select distinct  substr(Transaction_date,0,10) from test_dev_db.TransactionUpdateTable;"`
echo $var
echo "Executing the hive query - Get the parition data"
hive -hiveconf paritionvalue=$var -e 'SELECT Product FROM test_dev_db.TransactionMainHistoryTable where tran_date in("${hiveconf:paritionvalue}");'
echo "Hive query - ends"

The output is:

Executing the hive query - Get distinct dates
2009-02-01 2009-04-01
Executing the hive query - Get the parition data

Logging initialized using configuration in file:/hive/conf/hive-log4j.properties
OK
Product1
Product1
Product1
Product1
Product1
Product1
Time taken: 0.523 seconds, Fetched: 6 row(s)
Hive query - ends

It takes only the first date as input. I would like to pass my dates as ('2009-02-01','2009-04-01'). Note: TransactionMainHistoryTable is partitioned on the tran_date column, which is of string type.

Recommended answer

Collect the distinct values into an array with collect_set and join them with the delimiter ','. This produces the list without its outer quotes (2009-02-01','2009-04-01), so add the outer quotes ' in the second command, or add them in the first query. When executing inline SQL (the -e option) you do not need to pass a hiveconf variable; direct shell variable substitution works, provided the query is wrapped in double quotes so that the shell expands the variable. Use hiveconf when you are executing a script from a file (the -f option).
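For the file-based case mentioned above, the following is a minimal sketch, assuming a hypothetical file /tmp/partition_query.hql and a hypothetical variable name date_list (neither appears in the original answer). The list is hard-coded here for illustration and already carries its quotes, so the query file uses the variable without adding any.

```shell
#!/bin/bash
# Hypothetical file name and variable name, chosen only for illustration.
# The quoted 'EOF' stops the shell from expanding ${hiveconf:date_list},
# so Hive performs the substitution itself when it reads the file.
cat > /tmp/partition_query.hql <<'EOF'
SELECT Product
FROM test_dev_db.TransactionMainHistoryTable
WHERE tran_date IN (${hiveconf:date_list});
EOF

# Already-quoted list; in a real script it would be built from a query result.
date_list="'2009-02-01','2009-04-01'"

if command -v hive >/dev/null 2>&1; then
  # Quoting "$date_list" keeps the whole list in a single argument.
  hive -hiveconf date_list="$date_list" -f /tmp/partition_query.hql
else
  echo "hive not found; would run: hive -hiveconf date_list=$date_list -f /tmp/partition_query.hql"
fi
```

Keeping the shell variable in double quotes on the command line matters here: unquoted, a value containing spaces or newlines would be split into several arguments.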

Working example (use your table instead of stack):

date_list=$(hive -S -e "select concat_ws('\',\'',collect_set(substr(dt,0,10))) from (select stack (2,'2017-01', '2017-02')as dt)s ;")

hive -e "select * from (select stack (2,'2017-01', '2017-02')as dt)s where dt in ('${date_list}');"

Returns:

OK

2017-01
2017-02
Time taken: 1.221 seconds, Fetched: 2 row(s)
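As an alternative to building the list inside Hive, the quoting can also be done in the shell. The sketch below is not part of the original answer: build_in_list is a hypothetical helper, and the two dates are sample data copied from the output shown in the question so that the script runs without a Hive installation. It assumes the values contain no spaces or glob characters, which holds for dates in this format.

```shell
#!/bin/bash
# Hypothetical helper (not from the original answer): turns whitespace- or
# newline-separated values into a quoted, comma-separated list for an IN clause.
build_in_list() {
  local out="" d
  # $1 is intentionally left unquoted so the shell splits it into one word per date.
  for d in $1; do
    out="${out:+$out,}'$d'"
  done
  printf '%s' "$out"
}

# Sample data matching the question's output; in a real script this would be
# the result of the first "select distinct ..." hive -S -e call.
var=$(printf '2009-02-01\n2009-04-01\n')
date_list=$(build_in_list "$var")

# Double quotes let the shell substitute ${date_list} before Hive sees the query.
query="SELECT Product FROM test_dev_db.TransactionMainHistoryTable WHERE tran_date IN (${date_list});"
echo "$query"
# To execute it: hive -e "$query"
```

The same word splitting explains the behaviour in the question: there $var is unquoted on the hive command line, so the shell breaks it into separate arguments and only the first date ends up in paritionvalue.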
