How to use multiple Hive query results in a variable for another query

Problem description

I have two tables, one for schools and one for students. I want to find all the students of a particular school. The schema of schools is: id, name, location; the schema of students is: id, name, schoolId. I wrote the following script:

schoolId=$(hive -e "set hive.cli.print.header=false;select id from school;")
hive -hiveconf "schoolId"="$schoolId"

hive> select id,name from student where schoolId like '${hiveconf:schoolId}%';

I don't get any results because schoolId holds all the ids together. For example, with 3 schools whose ids are 123, 256, and 346, the schoolId variable is stored as 123 256 346, and the query returns nothing.
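To make the failure concrete, here is a minimal sketch that reproduces it without a Hive installation. The function fake_hive_ids is a hypothetical stand-in for the hive -e call, and it assumes the Hive CLI prints one id per line for a single-column result. The ids probably show up as "123 256 346" only when the variable is echoed unquoted; inside the variable they are separated by newlines.

```shell
#!/bin/sh
# Hypothetical stand-in for: hive -e "select id from school;"
# Assumption: the Hive CLI prints one id per line.
fake_hive_ids() {
  printf '%s\n' 123 256 346
}

# Command substitution strips only the trailing newline,
# so all three ids land in a single variable.
schoolId=$(fake_hive_ids)

# This is the predicate Hive would see once ${hiveconf:schoolId}
# is substituted: one pattern containing every id, which no row matches.
predicate="schoolId like '${schoolId}%'"
printf '%s\n' "$predicate"
```

The LIKE pattern is a single string made of all three ids, so it cannot match any individual schoolId value.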

Recommended answer

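Use collect_set() with concat_ws() to build a delimited string. The IDs must be cast to string because concat_ws() accepts only strings or arrays of strings. The separator is ',' (quote, comma, quote), so that the result forms a valid list of string literals once it is wrapped in single quotes: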

schoolId=$(hive -e "set hive.cli.print.header=false;select concat_ws('\',\'',collect_set(cast(id as string))) from school;");

hive -hiveconf "schoolId"="$schoolId" 

Then use the IN operator:

select id,name from student where schoolId in ('${hiveconf:schoolId}');
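The sketch below shows what the final query text should look like after substitution. It does not call Hive: fake_hive_ids is a hypothetical stand-in for the id query, and paste plus sed are used here only to imitate the string that concat_ws('\',\'', ...) would return, which is not the method used in the answer itself.

```shell
#!/bin/sh
# Hypothetical stand-in for the ids returned by the school table.
fake_hive_ids() {
  printf '%s\n' 123 256 346
}

# Imitate concat_ws with the separator ',' (quote, comma, quote):
# first join the lines with commas, then turn each comma into ','.
schoolId=$(fake_hive_ids | paste -sd, - | sed "s/,/','/g")

# Wrapping the value in single quotes, as the answer's query does,
# closes the first and last literal and yields a valid IN list.
query="select id,name from student where schoolId in ('${schoolId}');"
printf '%s\n' "$query"
```

Because the variable starts and ends without quotes, the outer quotes in in ('...') are required; leaving them out would produce an unbalanced literal.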
