How to specify multiple dependencies for spark-submit


Problem description

I have the following as the command line to start a spark streaming job.

    spark-submit --class com.biz.test \
            --packages \
                org.apache.spark:spark-streaming-kafka_2.10:1.3.0 \
                org.apache.hbase:hbase-common:1.0.0 \
                org.apache.hbase:hbase-client:1.0.0 \
                org.apache.hbase:hbase-server:1.0.0 \
                org.json4s:json4s-jackson:3.2.11 \
            ./test-spark_2.10-1.0.8.jar \
            >spark_log 2>&1 &

The job fails to start with the following error:

Exception in thread "main" java.lang.IllegalArgumentException: Given path is malformed: org.apache.hbase:hbase-common:1.0.0
    at org.apache.spark.util.Utils$.resolveURI(Utils.scala:1665)
    at org.apache.spark.deploy.SparkSubmitArguments.parse$1(SparkSubmitArguments.scala:432)
    at org.apache.spark.deploy.SparkSubmitArguments.parseOpts(SparkSubmitArguments.scala:288)
    at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:87)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:105)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I've tried removing the formatting and returning to a single line, but that doesn't resolve the issue. I've also tried a bunch of variations: different versions, adding _2.10 to the end of the artifactId, etc.

According to the documentation (spark-submit --help):

The format for the coordinates should be groupId:artifactId:version.

So what I have should be valid and should reference this package.

If it helps, I'm running Cloudera 5.4.4.

What am I doing wrong? How can I reference the hbase packages correctly?

Recommended answer

A list of packages should be separated using commas without whitespace (line breaks should work just fine), for example:

--packages  org.apache.spark:spark-streaming-kafka_2.10:1.3.0,\
  org.apache.hbase:hbase-common:1.0.0
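Since the whole list must reach spark-submit as one comma-separated argument with no embedded spaces, it can help to build that string programmatically instead of editing it by hand. The following is a minimal sketch, assuming bash and the same coordinates as in the question; the variable names are illustrative, not part of any spark-submit API:

```shell
# Coordinates for --packages, one per array element (groupId:artifactId:version).
deps=(
  org.apache.spark:spark-streaming-kafka_2.10:1.3.0
  org.apache.hbase:hbase-common:1.0.0
  org.apache.hbase:hbase-client:1.0.0
  org.apache.hbase:hbase-server:1.0.0
  org.json4s:json4s-jackson:3.2.11
)

# Join the array with commas (IFS is changed only inside the subshell),
# producing a single argument with no whitespace.
packages=$(IFS=,; echo "${deps[*]}")
echo "$packages"

# The corrected submit command would then be (not executed here):
# spark-submit --class com.biz.test \
#     --packages "$packages" \
#     ./test-spark_2.10-1.0.8.jar \
#     >spark_log 2>&1 &
```

This keeps the dependency list readable in the script while guaranteeing that spark-submit sees a single, correctly formatted value rather than five separate path-like arguments.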
