Looping using Hiveql

Views: 288 Published: 2018/6/12 13:51:29 Tags: hive hiveql

Problem description

I'm trying to merge 2 datasets, say A and B. Dataset A has a variable "Flag" which takes 2 values. Rather than just merging both datasets together, I am trying to merge them separately for each value of the "Flag" variable.

The merging code is the following:

  create table new_data as
  select a.*, b.y
  from A as a
  left join B as b
    on a.x = b.x;

Since I'm running the Hive code through the CLI, I call it with the following command:

  hive -f new_data.hql
The looping part of the code, which I call to merge the data based on the "Flag" variable, is the following:

  for flag in 1 2; do
    hive -hivevar flag=$flag -f new_data.hql
  done

I put the above code in another ".hql" file and am calling it like this:

  hive -f loop_data.hql
But it throws an error:

cannot recognize input near 'for' 'flag' 'in'

Can anybody please tell me where I'm making a mistake?

Thanks!
Solution

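You should put the loop logic in a shell script. A `for` loop is shell syntax, not HiveQL, so Hive cannot parse it when it is passed in with `hive -f`.

File Name: loop_data.sh

  for flag in 1 2; do
    hive -hivevar flag=$flag -f new_data.hql
  done

And execute the script like this:

  sh loop_data.sh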
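
One thing the loop above does not do is stop when one of the Hive runs fails. A minimal sketch of a stricter variant follows. The function name `run_for_flags` and the `HIVE_BIN` variable are hypothetical names introduced here for illustration; they are not part of Hive.

```shell
#!/bin/sh
# HIVE_BIN is a hypothetical override so the loop can be dry-run
# with a stand-in command; it defaults to the real hive CLI.
HIVE_BIN="${HIVE_BIN:-hive}"

# run_for_flags SCRIPT FLAG...
# Runs SCRIPT once per flag value and stops at the first failure.
run_for_flags() {
    script="$1"
    shift
    for flag in "$@"; do
        "$HIVE_BIN" -hivevar "flag=$flag" -f "$script" || {
            echo "hive failed for flag=$flag" >&2
            return 1
        }
    done
}
```

Appending a call such as `run_for_flags new_data.hql 1 2` to the file and running it with `sh` should issue the same two Hive invocations as the plain loop.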

In your new_data.hql script you are creating the table, so running it once per flag value would fail on the second pass because the table already exists. You should split the DDL and DML into 2 separate scripts, like this:

DDL: create_new_data.hql

  create table new_data as
  select a.*, b.y
  from A as a
  left join B as b
    on a.x = b.x
  where 1 = 0;

DML: insert_new_data.hql

  insert into new_data
  select a.*, b.y
  from A as a
  left join B as b
    on a.x = b.x
  where flag = ${hiveconf:flag};
And update your shell script like this:

File Name: loop_new_data.sh

  # Create table
  hive -f create_new_data.hql

  # Insert data
  for flag in 1 2; do
    hive -hiveconf flag=$flag -f insert_new_data.hql
  done

And execute it like this:

  sh loop_new_data.sh
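
Because `${hiveconf:flag}` is substituted into the query as text before the query runs, it can help to check each value before handing it to Hive. The sketch below assumes the flag values are non-negative integers, as in this question; `is_valid_flag` is a hypothetical helper name, not something Hive provides.

```shell
#!/bin/sh
# is_valid_flag VALUE
# Succeeds only when VALUE is a non-empty string of digits.
# Assumption: flag values are non-negative integers such as 1 and 2.
is_valid_flag() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

In loop_new_data.sh, a line such as `is_valid_flag "$flag" || exit 1` placed before the hive call inside the loop would stop the script on an unexpected value.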
Let me know if you want more info.
