Hive/Beeline, how can I set the job .staging directory?


Question

On the cluster I'm working on, every user is given a 60GB Hadoop quota. Historically, the project I'm working on generates a lot of Hive queries. To make things run faster, I'm trying to run these (unrelated) queries in parallel, but as a result the directory /user/{myusername}/.staging/ fills up with job_{someid} directories, which in turn are filled with the Hive jars and consume the 60GB very quickly. While I can limit the degree of parallelism, I would also like to see whether I can ask Hive to put these jars in a different directory, say /tmp/{myusername}, where I have a lot more space.

Any idea how I can tell Hive/Beeline to create the .staging directory under /tmp/{myusername}?

Solution

The above doesn't work.

We found that the following works:

beeline --hiveconf hive.exec.stagingdir=/tmp/{myusername}
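If you launch Beeline from scripts, it may be handy to build that option per user instead of hard-coding the path. Below is a minimal sketch of that idea. The helper names staging_dir_for and beeline_staging_args and the STAGING_ROOT variable are made up for illustration; only the --hiveconf option and the hive.exec.stagingdir property come from the command above.

```shell
#!/bin/sh
# Sketch only: helper names and STAGING_ROOT are hypothetical, not part of Hive or Beeline.

# Print the staging directory for a given user, e.g. /tmp/alice.
# STAGING_ROOT defaults to /tmp when unset or empty.
staging_dir_for() {
    printf '%s/%s\n' "${STAGING_ROOT:-/tmp}" "$1"
}

# Print the Beeline option that points hive.exec.stagingdir at that directory.
beeline_staging_args() {
    printf '%s\n' "--hiveconf hive.exec.stagingdir=$(staging_dir_for "$1")"
}

# Intended use (left commented out so the helpers can be sourced without Beeline installed):
#   beeline $(beeline_staging_args "$(whoami)")
```

The unquoted $(...) in the commented usage line is deliberate so that the shell splits the output into the two arguments Beeline expects; this assumes the user name and STAGING_ROOT contain no whitespace.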
