Spark-Shell: How to define JAR loading order


Problem Description

Running spark-shell locally and defining a classpath to some 3rd-party JARs:

$ spark-shell --driver-class-path /Myproject/LIB/*

Within the shell, I typed

scala> import com.google.common.collect.Lists
<console>:19: error: object collect is not a member of package com.google.common
   import com.google.common.collect.Lists
                            ^

I suppose Spark loads /usr/local/spark-1.4.0-bin-hadoop2.6/lib/spark-assembly-1.4.0-hadoop2.6.0.jar first, which doesn't contain the com.google.common.collect package.
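
One way to see what actually ended up on the driver's JVM class path is to print it from inside the shell (on Linux/macOS the entries are colon-separated); this is plain JVM introspection, nothing Spark-specific:

scala> System.getProperty("java.class.path").split(":").foreach(println)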

/Myproject/LIB/ contains google-collections-1.0.jar, which does provide com.google.common.collect. However, this jar seems to be ignored.
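
To confirm which jar actually contains the missing package, listing the jar entries is a quick check (assuming the JDK jar tool and grep are available):

# Does the third-party jar ship com.google.common.collect.Lists?
$ jar tf /Myproject/LIB/google-collections-1.0.jar | grep com/google/common/collect/Lists

# And does the Spark assembly?
$ jar tf /usr/local/spark-1.4.0-bin-hadoop2.6/lib/spark-assembly-1.4.0-hadoop2.6.0.jar | grep com/google/common/collect/Lists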

Question: how can I tell spark-shell to load the JARs in --driver-class-path before those in spark-1.4.0-bin-hadoop2.6/lib/?

ANSWER: combining hints from Sathish's and Holden's comments:
--jars must be used instead of --driver-class-path. All jar files must be listed explicitly, comma-delimited with no spaces (as per spark-shell --help):

$ spark-shell --jars $(echo ./Myproject/LIB/*.jar | tr ' ' ',')
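
After starting the shell this way, the import should resolve; a quick sanity check in the shell (Lists.newArrayList has been part of google-collections/Guava since 1.0):

scala> import com.google.common.collect.Lists
scala> Lists.newArrayList("a", "b", "c")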

Solution

The driver class path flag needs to be comma separated. So, based on Setting multiple jars in java classpath, we can try

$ spark-shell --driver-class-path $(echo ./Myproject/LIB/*.jar | tr ' ' ',')
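
If a conflicting class also exists inside the Spark assembly and your version needs to win, Spark additionally exposes the spark.driver.userClassPathFirst / spark.executor.userClassPathFirst settings (marked experimental in the 1.x docs); a hedged sketch combining them with --jars, not verified on this exact setup:

$ spark-shell \
    --jars $(echo ./Myproject/LIB/*.jar | tr ' ' ',') \
    --conf spark.driver.userClassPathFirst=true \
    --conf spark.executor.userClassPathFirst=true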
