Hive on Spark CDH 5.7 - Failed to create spark client


Problem description

We are getting the following error while executing Hive queries with the Spark engine.

Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)' FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

The following properties are set to use Spark as the execution engine instead of MapReduce:

set hive.execution.engine=spark;
set spark.executor.memory=2g;

I also tried changing the following properties:

set yarn.scheduler.maximum-allocation-mb=2048;
set yarn.nodemanager.resource.memory-mb=2048;
set spark.executor.cores=4;
set spark.executor.memory=4g;
set spark.yarn.executor.memoryOverhead=750;
set hive.spark.client.server.connect.timeout=900000ms;

Do I need to set some other properties? Can anyone suggest?

Recommended answer

It seems the YARN container memory was smaller than the Spark executor requirement. Please set the YARN container memory and maximum allocation to be greater than Spark executor memory + overhead:

  1. yarn.scheduler.maximum-allocation-mb
  2. yarn.nodemanager.resource.memory-mb
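
With the settings from the question, the mismatch is easy to verify by hand. A rough sketch (the exact overhead accounting varies by Spark version, so treat the numbers as illustrative):

```python
# Hypothetical check, not part of the original answer: a YARN container must
# fit the Spark executor heap plus its off-heap overhead.
executor_memory_mb = 4 * 1024   # spark.executor.memory=4g
memory_overhead_mb = 750        # spark.yarn.executor.memoryOverhead=750
max_allocation_mb = 2048        # yarn.scheduler.maximum-allocation-mb

required_mb = executor_memory_mb + memory_overhead_mb
print(required_mb)                       # 4846
print(required_mb <= max_allocation_mb)  # False: YARN cannot grant the container
```

Because the request exceeds the maximum allocation, YARN never grants the executor container and the Spark client times out, which surfaces as "Failed to create spark client" in Hive.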

yarn.nodemanager.resource.memory-mb:

Amount of physical memory, in MB, that can be allocated for containers. It is the amount of memory YARN can utilize on this node, so this property should be lower than the total memory of the machine.

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>40960</value> <!-- 40 GB -->
</property>

The next step is to provide YARN guidance on how to break up the total resources available into Containers. You do this by specifying the minimum unit of RAM to allocate for a Container.

In yarn-site.xml

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name> <!-- RAM per container -->
  <value>2048</value>
</property>

yarn.scheduler.maximum-allocation-mb:

It defines the maximum memory allocation, in MB, available for a container.

It means the ResourceManager can only allocate memory to containers in increments of yarn.scheduler.minimum-allocation-mb, without exceeding yarn.scheduler.maximum-allocation-mb, and the total should not be more than the allocated memory of the node.

In yarn-site.xml

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name> <!-- Max RAM per container -->
  <value>8192</value>
</property>
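
The increment behavior above can be sketched as follows. This is an assumption based on YARN's default resource calculator, which rounds each request up to a multiple of the minimum allocation and caps it at the maximum; the exact logic varies by Hadoop version:

```python
import math

# Rough sketch of how YARN normalizes a container memory request
# (assumed DefaultResourceCalculator behavior; values from this answer).
def normalize_request(requested_mb, min_alloc_mb=2048, max_alloc_mb=8192):
    # Round up to the next multiple of the minimum allocation...
    granted = math.ceil(requested_mb / min_alloc_mb) * min_alloc_mb
    # ...but never beyond the maximum allocation.
    return min(granted, max_alloc_mb)

# A 4g executor + 750MB overhead (4846MB) is rounded up to 6144MB,
# which now fits under the 8192MB maximum.
print(normalize_request(4096 + 750))  # 6144
```

With these yarn-site.xml values, the executor request from the question fits into a container, whereas the earlier 2048MB maximum rejected it outright.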

Also check the Spark History Server: go to the Spark on YARN service instance > History Server > History Server Web UI > click the relevant job > click the relevant failed job > click the failed stages for that job and look at the "details" section.

