Apache Spark's deployment issue (cluster-mode) with Hive

Problem description

Edit:

I'm developing a Spark application that reads data from multiple structured schemas, and I'm trying to aggregate information across those schemas. My application runs well locally, but when I run it on a cluster I run into trouble with the configuration (most probably hive-site.xml) or with the submit-command arguments. I've looked through other related posts, but couldn't find a solution specific to my scenario. I've described in detail below the commands I tried and the errors I got. I'm new to Spark and might be missing something trivial, but I can provide more information to support my question.

Original question:

I've been trying to run my Spark application on a 6-node Hadoop cluster bundled with HDP 2.3 components.

Here is the component information that might be useful in suggesting a solution:

Cluster information: 6-node cluster:

128 GB RAM, 24 cores, 8 TB HDD

Components used in the application:

HDP - 2.3

Spark - 1.3.1

$ hadoop version:

Hadoop 2.7.1.2.3.0.0-2557
Subversion git@github.com:hortonworks/hadoop.git -r 9f17d40a0f2046d217b2bff90ad6e2fc7e41f5e1
Compiled by jenkins on 2015-07-14T13:08Z
Compiled with protoc 2.5.0
From source with checksum 54f9bbb4492f92975e84e390599b881d

Scenario:

I'm trying to use SparkContext and HiveContext together so as to take full advantage of Spark's real-time querying over its data structures, such as DataFrames. The dependencies used in my application are:

    <dependency> <!-- Spark dependency -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.3.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.10</artifactId>
        <version>1.3.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.10</artifactId>
        <version>1.3.1</version>
    </dependency>
    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-csv_2.10</artifactId>
        <version>1.4.0</version>
    </dependency>
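
To make the scenario concrete, here is roughly what my driver does (a minimal sketch; the schema and table names schema_a.sales / schema_b.sales are placeholders, not the real ones):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object Main {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MultiSchemaAggregation")
    val sc = new SparkContext(conf)
    // HiveContext needs a reachable metastore; in cluster mode this is
    // exactly where hive-site.xml must be visible to the driver.
    val hiveContext = new HiveContext(sc)

    // Hypothetical tables in two schemas; the real names differ.
    val a = hiveContext.sql("SELECT region, amount FROM schema_a.sales")
    val b = hiveContext.sql("SELECT region, amount FROM schema_b.sales")

    // Combine and aggregate across schemas (unionAll is the 1.3.x name).
    val combined = a.unionAll(b)
    combined.groupBy("region").sum("amount").show()

    sc.stop()
  }
}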

Below are the submit commands I used and the corresponding error logs:

Submit command 1:

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 17 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --num-executors 5 \
    application-with-all-dependencies.jar

Error log 1:

User class threw exception: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient 

Submit command 2:

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 17 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --num-executors 5 \
    --files /etc/hive/conf/hive-site.xml \
    application-with-all-dependencies.jar

Error log 2:

User class threw exception: java.lang.NumberFormatException: For input string: "5s" 
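
From similar reports, this exception seems to be triggered when hive-site.xml contains time-valued properties with a unit suffix; the older Hive client bundled with Spark 1.3.1 tries to parse them as plain numbers. A typical entry of that kind looks like this (illustrative; I haven't confirmed which property it is on this cluster):

<property>
  <name>hive.metastore.client.connect.retry.delay</name>
  <value>5s</value>
</property>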

Since I don't have administrative permissions, I cannot modify the configuration. I could contact the IT engineers and have the changes made, but I'm looking for a solution that involves as few changes to the configuration files as possible!

Configuration changes are suggested here.

I then tried passing various jar files as arguments, as suggested in other discussion forums.

Submit command 3:

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 17 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --num-executors 5 \
    --jars /usr/hdp/2.3.0.0-2557/spark/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/2.3.0.0-2557/spark/lib/datanucleus-core-3.2.10.jar,/usr/hdp/2.3.0.0-2557/spark/lib/datanucleus-rdbms-3.2.9.jar \
    --files /etc/hive/conf/hive-site.xml \
    application-with-all-dependencies.jar

Error log 3:

User class threw exception: java.lang.NumberFormatException: For input string: "5s" 

I couldn't work out what happened with the following command, and I wasn't able to make sense of the error log.

Submit command 4:

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 17 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --num-executors 5 \
    --jars /usr/hdp/2.3.0.0-2557/spark/lib/*.jar \
    --files /etc/hive/conf/hive-site.xml \
    application-with-all-dependencies.jar

Error log 4:

Application application_1461686223085_0014 failed 2 times due to AM Container for appattempt_1461686223085_0014_000002 exited with exitCode: 10
For more detailed output, check application tracking page:http://cluster-host:XXXX/cluster/app/application_1461686223085_0014Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e10_1461686223085_0014_02_000001
Exit code: 10
Stack trace: ExitCodeException exitCode=10:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 10
Failing this attempt. Failing the application. 

Are there any other possible options? Any kind of help would be highly appreciated. Please let me know if you need any other information.

Thanks.

Recommended answer

The solution explained here worked for my case. There are two locations where hive-site.xml resides, which can be confusing: use --files /usr/hdp/current/spark-client/conf/hive-site.xml instead of --files /etc/hive/conf/hive-site.xml. I didn't have to add the jars for my configuration. Hope this helps someone struggling with a similar problem. Thanks.
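
For completeness, a sketch of the resulting submit command (same class and sizing as in the question, keeping a single --num-executors flag, which was duplicated in the original commands):

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 5 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --files /usr/hdp/current/spark-client/conf/hive-site.xml \
    application-with-all-dependencies.jar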
