How to change memory settings for Hive Activity running in AWS data pipeline?


Problem description

While running a Hive Activity using AWS Data Pipeline, my Hive activity is failing with the following error:

Diagnostics: Container [pid=,containerID=] is running beyond physical memory limits. 
Current usage: 1.0 GB of 1 GB physical memory used;
2.8 GB of 5 GB virtual memory used. Killing container. 

When I ran the Hive script that the Hive Activity executes manually, I had to invoke it as shown below:

hive \
-hiveconf tez.am.resource.memory.mb=16000 \
-hiveconf mapreduce.map.memory.mb=10240 \
-hiveconf mapreduce.map.java.opts=-Xmx8192m \
-hiveconf mapreduce.reduce.memory.mb=10240 \
-hiveconf mapreduce.reduce.java.opts=-Xmx8192m \
-hiveconf hive.exec.parallel=true \
-f <hive script file path>

With these settings, the Hive script executes perfectly.

Now the question is: how do I pass these settings to the Hive Activity of AWS Data Pipeline? I can't seem to find any way to pass -hiveconf to the Hive Activity.

Recommended answer

How are you calling your Hive script within Data Pipeline? If you use a ShellCommandActivity, you should be able to pass these -hiveconf settings just as you would on the command line, and it should run fine.
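As a rough sketch of that approach, a pipeline definition could replace the HiveActivity with a ShellCommandActivity whose `command` is the same hive invocation from the question. The object names, the S3 script path, and the cluster settings below are placeholders, not values from the original post:

```json
{
  "objects": [
    {
      "id": "MyEmrCluster",
      "name": "MyEmrCluster",
      "type": "EmrCluster"
    },
    {
      "id": "RunHiveScript",
      "name": "RunHiveScript",
      "type": "ShellCommandActivity",
      "runsOn": { "ref": "MyEmrCluster" },
      "command": "hive -hiveconf tez.am.resource.memory.mb=16000 -hiveconf mapreduce.map.memory.mb=10240 -hiveconf mapreduce.map.java.opts=-Xmx8192m -hiveconf mapreduce.reduce.memory.mb=10240 -hiveconf mapreduce.reduce.java.opts=-Xmx8192m -hiveconf hive.exec.parallel=true -f s3://my-bucket/my-script.hql"
    }
  ]
}
```

Because the whole command line is under your control here, any -hiveconf flag that works when you run hive by hand on the cluster should carry over unchanged.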
