How to submit a spark job on a remote master node in yarn client mode?


Question

I need to submit Spark apps/jobs onto a remote Spark cluster. I currently have Spark on my machine, and the IP address of the master node, as yarn-client. By the way, my machine is not in the cluster. I submit my job with this command:

./spark-submit --class SparkTest --deploy-mode client /home/vm/app.jar 

I have the address of my master hardcoded into my app in the form:

val spark_master = "spark://IP:7077"

However, all I get is this error:

16/06/06 03:04:34 INFO AppClient$ClientEndpoint: Connecting to master spark://IP:7077...
16/06/06 03:04:34 WARN AppClient$ClientEndpoint: Failed to connect to master IP:7077
java.io.IOException: Failed to connect to /IP:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /IP:7077

Or, if I use:

./spark-submit --class SparkTest --master yarn --deploy-mode client /home/vm/test.jar

I get:

Exception in thread "main" java.lang.Exception: When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:251)
at org.apache.spark.deploy.SparkSubmitArguments.validateArguments(SparkSubmitArguments.scala:228)
at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:109)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:114)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Do I really need to have Hadoop configured on my workstation as well? All the work will be done remotely, and this machine is not part of the cluster. I am using Spark 1.6.1.

Answer

First of all, if you are setting conf.setMaster(...) from your application code, it takes the highest precedence (over the --master argument). If you want to run in yarn-client mode, do not use MASTER_IP:7077 in the application code. You should supply the Hadoop client configuration files to your driver in the following way.
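
As a minimal sketch of this (assuming the application class is called SparkTest, matching the submit commands above; the job body is purely illustrative), the driver builds its SparkConf without calling setMaster(...), so whatever --master flag spark-submit receives takes effect:

import org.apache.spark.{SparkConf, SparkContext}

object SparkTest {
  def main(args: Array[String]): Unit = {
    // No setMaster(...) here: a master hardcoded in code would override
    // the --master flag, so leave it to spark-submit (yarn, spark://..., local[*]).
    val conf = new SparkConf().setAppName("SparkTest")
    val sc = new SparkContext(conf)

    // Trivial job just to verify the cluster connection works.
    println(sc.parallelize(1 to 100).sum())
    sc.stop()
  }
}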

You should set the environment variable HADOOP_CONF_DIR or YARN_CONF_DIR to point to the directory which contains the client configurations.

http://spark.apache.org/docs/latest/running-on-yarn.html
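
For example (a sketch; /opt/hadoop/conf is a hypothetical path on the workstation holding copies of the cluster's core-site.xml, hdfs-site.xml, and yarn-site.xml):

export HADOOP_CONF_DIR=/opt/hadoop/conf
./spark-submit --class SparkTest --master yarn --deploy-mode client /home/vm/test.jar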

Depending upon which Hadoop features you are using in your Spark application, some of the config files will be used to look up configuration. If you are using Hive (through HiveContext in spark-sql), it will look for hive-site.xml. hdfs-site.xml will be used to look up the NameNode coordinates for reading from and writing to HDFS from your job.
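
As a sketch of what those files enable (the HDFS path and Hive table below are hypothetical), once the client configs are visible the driver can reach HDFS and Hive without hardcoding any host names:

// Assumes sc is the SparkContext from the sketch above (Spark 1.6 API).
// A scheme-less path resolves against the NameNode from the client configs.
val lines = sc.textFile("/user/vm/input.txt")   // hypothetical HDFS path
println(lines.count())

import org.apache.spark.sql.hive.HiveContext
val hive = new HiveContext(sc)                  // reads hive-site.xml
hive.sql("SELECT * FROM some_db.some_table LIMIT 10").show()  // hypothetical table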

