spark-submit yarn-cluster with --jars does not work?


Problem description

I am trying to submit a Spark job to the CDH YARN cluster via the following commands:

I have tried several combinations, and none of them work. I now have all the POI jars located both in my local /root and in HDFS at /user/root/lib, so I have tried the following:

spark-submit --master yarn-cluster --class "ReadExcelSC" ./excel_sc.jar --jars /root/poi-3.12.jars, /root/poi-ooxml-3.12.jar, /root/poi-ooxml-schemas-3.12.jar

spark-submit --master yarn-cluster --class "ReadExcelSC" ./excel_sc.jar --jars file:/root/poi-3.12.jars, file:/root/poi-ooxml-3.12.jar, file:/root/poi-ooxml-schemas-3.12.jar

spark-submit --master yarn-cluster --class "ReadExcelSC" ./excel_sc.jar --jars hdfs://mynamenodeIP:8020/user/root/poi-3.12.jars,hdfs://mynamenodeIP:8020/user/root/poi-ooxml-3.12.jar,hdfs://mynamenodeIP:8020/user/root/poi-ooxml-schemas-3.12.jar

How do I propagate the jars to all cluster nodes? None of the above works, and the job still somehow does not get to reference the class, as I keep getting the same error:

java.lang.NoClassDefFoundError: org/apache/poi/ss/usermodel/WorkbookFactory

The same command works with "--master local" without specifying --jars, as I have copied my jars to /opt/cloudera/parcels/CDH/lib/spark/lib.

However, for yarn-cluster mode I need to distribute the external jars to all cluster nodes, but the commands above do not work.

Appreciate your help, thanks.

P.S. I am using CDH 5.4.2 with Spark 1.3.0.

Answer

As per the help in spark-submit:


  • --jars includes the local jars on the driver and executor classpaths. [it will just set the path]

  • --files will copy the jars needed for your application to the working directory of all the executor nodes [it will transport your jars to the working dir]

Note: This is similar to the -file option in Hadoop Streaming, which transports the mapper/reducer scripts to the slave nodes.

So try with the --files option as well.

$ spark-submit --help
Options:
  --jars JARS                 Comma-separated list of local jars to include on the driver
                              and executor classpaths.
  --files FILES               Comma-separated list of files to be placed in the working
                              directory of each executor.
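One more thing worth checking: spark-submit treats everything after the application jar as arguments to the application itself, so in the commands above the --jars option never reaches spark-submit at all. A minimal sketch combining both points, reusing the paths from the question (options before ./excel_sc.jar, and no spaces inside the comma-separated list):

```shell
# Sketch only, using the paths from the question.
# All spark-submit options must come before the application jar;
# anything after ./excel_sc.jar is passed to the application itself.
# The comma-separated jar list must not contain spaces.
spark-submit \
  --master yarn-cluster \
  --class "ReadExcelSC" \
  --jars /root/poi-3.12.jar,/root/poi-ooxml-3.12.jar,/root/poi-ooxml-schemas-3.12.jar \
  --files /root/poi-3.12.jar,/root/poi-ooxml-3.12.jar,/root/poi-ooxml-schemas-3.12.jar \
  ./excel_sc.jar
```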

Hope this helps.
