How to use multiple Hive query results in a variable for another query
Question
I have two tables, schools and students, and I want to find all the students of a particular school. The schema of schools is id, name, location; the schema of students is id, name, schoolId. I wrote the following script:
schoolId=$(hive -e "set hive.cli.print.header=false;select id from school;")
hive -hiveconf "schoolId"="$schoolId"
hive>select id,name from student where schoolId like '${hiveconf:schoolId}%'
I don't get any results because schoolId holds all the IDs together. For example, with three schools whose IDs are 123, 256, and 346, the schoolId variable is stored as 123 256 346, and the query returns nothing.
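The failure can be reproduced in plain shell without Hive. This is a minimal sketch, assuming the Hive CLI prints one ID per line and using the three IDs from the example; the `printf` call is only a stand-in for the `hive -e` command, and `flat` and `query` are illustrative names.

```shell
# Stand-in for $(hive -e "select id from school;"): assumed to print one ID per line.
schoolId=$(printf '123\n256\n346\n')

# Unquoted expansion collapses the newlines to spaces, which is how the
# value shows up as a single string when echoed.
flat=$(echo $schoolId)
printf '%s\n' "$flat"

# After substitution the whole multi-ID string is one LIKE pattern,
# so no single schoolId value can match it.
query="select id,name from student where schoolId like '${schoolId}%'"
printf '%s\n' "$query"
```

The pattern handed to LIKE contains all three IDs at once, which is why the query matches no rows.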
Recommended answer
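Use collect_set() with concat_ws() to build a comma-delimited string of IDs. The IDs must be cast to string because concat_ws() works on strings, and the escaped quotes in the separator put a closing and an opening single quote around each comma: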
schoolId=$(hive -e "set hive.cli.print.header=false;select concat_ws('\\',\\'',collect_set(cast(id as string))) from school;");
hive -hiveconf "schoolId"="$schoolId"
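Then use the IN operator, wrapping the variable in single quotes so that the first and last IDs are quoted as well: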
select id,name from student where schoolId in ('${hiveconf:schoolId}');
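To check what the IN predicate receives, the substitution can be traced in plain shell. This is a minimal sketch, assuming the three IDs from the question; the hard-coded `schoolId` value stands in for the output of the concat_ws/collect_set query, and the shell expansion imitates the textual replacement Hive performs for `${hiveconf:schoolId}`.

```shell
# Stand-in for the output of the concat_ws/collect_set query (assumed IDs):
# the separator ',' leaves the outermost quotes missing on purpose.
schoolId="123','256','346"

# The quotes written in the query close the first ID and open the last one,
# so the expanded text is a well-formed list of string literals.
query="select id,name from student where schoolId in ('${schoolId}');"
printf '%s\n' "$query"
# prints: select id,name from student where schoolId in ('123','256','346');
```

Because the IDs end up as quoted string literals, this form compares schoolId against strings; that matches the cast to string used when building the list.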