Storing query result in a variable


Problem description

I have a query whose result I want to store in a variable. How can I do it? I tried:

./hive -e  "use telecom;insert overwrite local directory '/tmp/result' select
avg(a) from abc;"

./hive --hiveconf MY_VAR=`cat /tmp/result/000000_0`;

I am able to get the average value into MY_VAR, but this drops me into the Hive CLI, which I don't need. Also, is there a way to run Unix commands from inside the Hive CLI?

Solution

Use case: in MySQL the following is valid:

set @max_date := (select max(date) from some_table);
select * from some_other_table where date > @max_date;

This is super useful for scripts that need to repeatedly call this variable since you only need to execute the max date query once rather than every time the variable is called.

Hive does not currently support this. (Please correct me if I'm wrong! I have been trying to figure out how to do this all afternoon.)

My workaround is to store the required variable in a table that is small enough to map join onto the query in which it is used. Because the join is a map-side join rather than a reduce-side (shuffle) join, it should not significantly hurt performance. For example:

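-- Rebuild the one-row table that holds the value.
drop table if exists var_table;

create table var_table as
select max(date) as max_date from some_table;

-- var_table has exactly one row, so a cross join attaches max_date to every row.
select some_other_table.*
from some_other_table
cross join var_table
where some_other_table.date > var_table.max_date;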

The solution suggested by @visakh is not optimal because it stores the string 'select count(1) from table_name;' rather than the returned value, so it will not help in cases where you need to reference a variable repeatedly during a script.
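
For the single-value case in the question, another option is to keep the value in a shell variable rather than inside Hive. The following is only a rough sketch under a few assumptions: it relies on the `-S` (silent) and `-e` options of the Hive CLI and on `--hivevar` substitution; the function names `hive_scalar` and `query_with_avg`, the `HIVE_BIN` override, and the second query are hypothetical and only for illustration. The `telecom` database and `abc` table are taken from the question.

```shell
#!/usr/bin/env bash
# Sketch: capture a single-value Hive query result in a shell variable,
# then pass it to a later query without opening the interactive CLI.
# HIVE_BIN is a hypothetical override for the path to the hive binary.
HIVE_BIN="${HIVE_BIN:-hive}"

# Run a query non-interactively and print its result on stdout.
# -S suppresses progress output, -e executes the quoted query string.
# tail -n 1 is a defensive choice that keeps only the last output line.
hive_scalar() {
    "$HIVE_BIN" -S -e "$1" | tail -n 1
}

# Run a follow-up query with the value substituted through --hivevar.
# Single quotes stop the shell from expanding ${hivevar:avg_a};
# Hive performs that substitution itself.
query_with_avg() {
    local avg="$1"
    "$HIVE_BIN" -S --hivevar avg_a="$avg" \
        -e 'use telecom; select * from abc where a > ${hivevar:avg_a};'
}

# Usage (requires a working Hive installation):
#   MY_VAR=$(hive_scalar "use telecom; select avg(a) from abc;")
#   query_with_avg "$MY_VAR"
```

Because both calls use `-e`, Hive exits after running the query instead of staying in the CLI. The value is computed once and can be reused across as many follow-up invocations as the script needs, at the cost of starting a new Hive process for each one.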
