How to set variables in HIVE scripts

Question

I'm looking for the Hive QL equivalent of SQL's SET varname = value.

I know I can do something like this:

SET CURRENT_DATE = '2012-09-16';
SELECT * FROM foo WHERE day >= @CURRENT_DATE

But then I get this error:

character '@' not supported here

Solution

You need to use the special hiveconf namespace for variable substitution, e.g.

hive> set CURRENT_DATE='2012-09-16';
hive> select * from foo where day >= ${hiveconf:CURRENT_DATE}

Similarly, you can pass the value on the command line:

% hive -hiveconf CURRENT_DATE='2012-09-16' -f test.hql
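One caveat with the command-line form: the shell strips the single quotes before hive sees the argument, so the substituted value arrives unquoted. The following is a minimal sketch of one way around that, assuming a hypothetical test.hql that supplies the quotes itself. The file contents and the `arg` variable are illustrations rather than part of the original answer, and `--hiveconf` is the double-dash spelling of the same option.

```shell
#!/bin/sh
# Hypothetical test.hql. The quoted heredoc delimiter ('EOF') stops the shell
# from expanding ${...}, so hive receives the placeholder literally.
cat > test.hql <<'EOF'
SELECT * FROM foo WHERE day >= '${hiveconf:CURRENT_DATE}';
EOF

# The shell removes the single quotes here: arg holds CURRENT_DATE=2012-09-16
arg=CURRENT_DATE='2012-09-16'
printf 'argument passed to hive: %s\n' "$arg"

# Only invoke hive when the CLI is actually installed.
if command -v hive >/dev/null 2>&1; then
  hive --hiveconf "$arg" -f test.hql
else
  echo "hive CLI not found; wrote test.hql only"
fi
```

Putting the quotes in the script rather than in the value keeps the query valid whether the variable is set from the shell or from another script, as long as every caller passes the bare date.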

Note that there are env and system variables as well, so you can reference ${env:USER} for example.

To see all the available variables, run this from the command line:

% hive -e 'set;'

or run this from the hive prompt:

hive> set;

Update: I've started to use hivevar variables as well, putting them into hql snippets that I can include from the hive CLI using the source command (or pass with the -i option on the command line). The benefit is that the variable can then be used with or without the hivevar prefix, which allows something akin to global vs. local use.

So, assume you have a setup.hql that sets a tablename variable:

set hivevar:tablename=mytable;

Then I can bring it into hive:

hive> source /path/to/setup.hql;

and use it in a query:

hive> select * from ${tablename}

or

hive> select * from ${hivevar:tablename}
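The same setup can be run non-interactively with the -i option mentioned above. This is a rough sketch under stated assumptions: query.hql is a hypothetical file name, both files are written to the current directory, and the table mytable is only the placeholder from the example.

```shell
#!/bin/sh
# Shared settings, as in the setup.hql described above.
cat > setup.hql <<'EOF'
set hivevar:tablename=mytable;
EOF

# Hypothetical query file that uses the variable with and without the prefix.
# The quoted heredoc delimiter keeps the shell from expanding ${...}.
cat > query.hql <<'EOF'
select * from ${tablename};
select * from ${hivevar:tablename};
EOF

# -i runs setup.hql as an initialization file before query.hql is executed.
if command -v hive >/dev/null 2>&1; then
  hive -i setup.hql -f query.hql
else
  echo "hive CLI not found; wrote setup.hql and query.hql only"
fi
```

Keeping the settings in their own file means several query scripts can share one set of definitions.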

I could also set a "local" tablename, which would affect the use of ${tablename}, but not ${hivevar:tablename}:

hive> set tablename=newtable;
hive> select * from ${tablename} -- uses 'newtable'

vs

hive> select * from ${hivevar:tablename} -- still uses the original 'mytable'

This probably doesn't matter much at the interactive CLI, but it is useful in an hql file that uses source to pull in shared settings and then sets some of the variables "locally" for the rest of the script.
