Automatically including jars to PySpark classpath


Problem description


I'm trying to automatically include jars to my PySpark classpath. Right now I can type the following command and it works:

$ pyspark --jars /path/to/my.jar

I'd like to have that jar included by default so that I only need to type pyspark, and so that it is also available in IPython Notebook.

I've read that I can include the argument by setting PYSPARK_SUBMIT_ARGS in the environment:

export PYSPARK_SUBMIT_ARGS="--jars /path/to/my.jar"

Unfortunately, the above doesn't work; I get the runtime error "Failed to load class for data source".

Running Spark 1.3.1.

Edit

My workaround when using IPython Notebook is the following:

$ IPYTHON_OPTS="notebook" pyspark --jars /path/to/my.jar
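One way to avoid retyping this every time (just a convenience sketch; the alias name pysparknb is arbitrary and the jar path is a placeholder) is to wrap the workaround in a shell alias, for example in ~/.bashrc:

alias pysparknb='IPYTHON_OPTS="notebook" pyspark --jars /path/to/my.jar'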

Solution

You can add the jar files to the spark-defaults.conf file (located in the conf folder of your Spark installation). If there is more than one entry in the list of jars, use : as the separator.

spark.driver.extraClassPath /path/to/my.jar

This property is documented in https://spark.apache.org/docs/1.3.1/configuration.html#runtime-environment
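For example, with two jars on the driver classpath (the second path is only a placeholder for illustration), the entry in spark-defaults.conf might look like this:

spark.driver.extraClassPath /path/to/my.jar:/path/to/other.jar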

