Apache Spark's deployment issue (cluster-mode) with Hive


Problem Description

Edit:

I'm developing a Spark application that reads data from multiple structured schemas, and I'm trying to aggregate the information from those schemas. My application runs fine when I run it locally, but when I run it on a cluster I'm having trouble with the configuration (most probably with hive-site.xml) or with the submit-command arguments. I've looked at other related posts but couldn't find a solution specific to my scenario. Below I've described in detail which commands I tried and which errors I got. I'm new to Spark and may be missing something trivial, but I can provide more information to support my question.

Original question:

I've been trying to run my Spark application on a 6-node Hadoop cluster bundled with HDP 2.3 components.

Here is some component information that might be useful in suggesting a solution:

Cluster information: 6-node cluster:

128 GB RAM
24 cores
8 TB HDD

Components used in the application:

HDP - 2.3

Spark - 1.3.1

$ hadoop version:

Hadoop 2.7.1.2.3.0.0-2557
Subversion git@github.com:hortonworks/hadoop.git -r 9f17d40a0f2046d217b2bff90ad6e2fc7e41f5e1
Compiled by jenkins on 2015-07-14T13:08Z
Compiled with protoc 2.5.0
From source with checksum 54f9bbb4492f92975e84e390599b881d

Scenario:

I'm trying to use SparkContext and HiveContext in a way that takes full advantage of Spark's real-time querying on its data structures, such as DataFrames. The dependencies used in my application are:

    <dependency> <!-- Spark dependency -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.3.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.10</artifactId>
        <version>1.3.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.10</artifactId>
        <version>1.3.1</version>
    </dependency>
    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-csv_2.10</artifactId>
        <version>1.4.0</version>
    </dependency>
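
A minimal sketch of the kind of thing the application does (the schema and table names below are placeholders, not the real ones):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object Main {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SchemaAggregator"))
    // HiveContext picks up hive-site.xml from the classpath to locate the metastore
    val hiveContext = new HiveContext(sc)

    // Read from two (placeholder) schemas and aggregate across them as DataFrames
    val df1 = hiveContext.sql("SELECT id, amount FROM schema_a.orders")
    val df2 = hiveContext.sql("SELECT id, amount FROM schema_b.orders")
    df1.unionAll(df2).groupBy("id").sum("amount").show()
  }
}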

Below are the submit commands and the corresponding error logs that I'm getting:

Submit Command 1:

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 17 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --num-executors 5 \
    application-with-all-dependencies.jar

Error Log 1:

User class threw exception: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient 

Submit Command 2:

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 17 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --num-executors 5 \
    --files /etc/hive/conf/hive-site.xml \
    application-with-all-dependencies.jar

Error Log 2:

User class threw exception: java.lang.NumberFormatException: For input string: "5s" 

Since I don't have administrative permissions, I cannot modify the configuration. Well, I could contact the IT engineer and have the changes made, but I'm looking for a solution that involves fewer changes to the configuration files, if possible!

Configuration changes were suggested here: https://hadoopist.wordpress.com/2016/02/23/how-to-resolve-error-yarn-applicationmaster-user-class-threw-exception-java-lang-runtimeexception-java-lang-numberformatexception-for-input-string-5s-in-spark-submit/

Then I tried passing various jar files as arguments, as suggested in other discussion forums.

Submit Command 3:

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 17 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --num-executors 5 \
    --jars /usr/hdp/2.3.0.0-2557/spark/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/2.3.0.0-2557/spark/lib/datanucleus-core-3.2.10.jar,/usr/hdp/2.3.0.0-2557/spark/lib/datanucleus-rdbms-3.2.9.jar \
    --files /etc/hive/conf/hive-site.xml \
    application-with-all-dependencies.jar

Error Log 3:

User class threw exception: java.lang.NumberFormatException: For input string: "5s" 

I didn't understand what happened with the following command and couldn't analyze the error log.

Submit Command 4:

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 17 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --num-executors 5 \
    --jars /usr/hdp/2.3.0.0-2557/spark/lib/*.jar \
    --files /etc/hive/conf/hive-site.xml \
    application-with-all-dependencies.jar

Error Log 4:

Application application_1461686223085_0014 failed 2 times due to AM Container for appattempt_1461686223085_0014_000002 exited with exitCode: 10
For more detailed output, check application tracking page:http://cluster-host:XXXX/cluster/app/application_1461686223085_0014Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e10_1461686223085_0014_02_000001
Exit code: 10
Stack trace: ExitCodeException exitCode=10:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 10
Failing this attempt. Failing the application. 

Any other possible options? Any kind of help will be highly appreciated. Please let me know if you need any other information.

Thank you.

Answer

The solution explained here worked for my case: https://community.hortonworks.com/questions/5798/spark-hive-tables-not-found-when-running-in-yarn-c.html. There are two locations where hive-site.xml resides, which can be confusing. Use --files /usr/hdp/current/spark-client/conf/hive-site.xml instead of --files /etc/hive/conf/hive-site.xml. I didn't have to add the jars for my configuration. Hope this helps someone struggling with a similar problem. Thanks.
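
For reference, a corrected submit command would look something like the sketch below (adapted from Submit Command 2 above, changing only the --files path; the duplicated --num-executors flag is also collapsed to a single value). The NumberFormatException on "5s" usually comes from time-suffixed values such as hive.metastore.client.connect.retry.delay=5s in /etc/hive/conf/hive-site.xml, which the Hive client bundled with Spark 1.3.1 cannot parse; pointing --files at the Spark client's copy avoids that.

spark-submit --class working.path.to.Main \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 5 \
    --executor-cores 8 \
    --executor-memory 25g \
    --driver-memory 25g \
    --files /usr/hdp/current/spark-client/conf/hive-site.xml \
    application-with-all-dependencies.jar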
