Add jar to pyspark when using notebook


Problem description


I'm trying the mongodb hadoop integration with spark but can't figure out how to make the jars accessible to an IPython notebook.


Here is what I'm trying to do:

# set up parameters for reading from MongoDB via Hadoop input format
config = {"mongo.input.uri": "mongodb://localhost:27017/db.collection"}
inputFormatClassName = "com.mongodb.hadoop.MongoInputFormat"

# these values worked but others might as well
keyClassName = "org.apache.hadoop.io.Text"
valueClassName = "org.apache.hadoop.io.MapWritable"

# Do some reading from mongo
items = sc.newAPIHadoopRDD(inputFormatClassName, keyClassName, valueClassName, None, None, config)


This code works fine when I launch it in pyspark using the following command:



spark-1.4.1/bin/pyspark --jars 'mongo-hadoop-core-1.4.0.jar,mongo-java-driver-3.0.2.jar'


where mongo-hadoop-core-1.4.0.jar and mongo-java-driver-2.10.1.jar allow using MongoDB from Java. However, when I do this:



IPYTHON_OPTS="notebook" spark-1.4.1/bin/pyspark --jars 'mongo-hadoop-core-1.4.0.jar,mongo-java-driver-3.0.2.jar'


The jars are not available anymore and I get the following error:



java.lang.ClassNotFoundException: com.mongodb.hadoop.MongoInputFormat


Does anyone know how to make the jars available to Spark in the IPython notebook? I'm pretty sure this is not specific to mongo, so maybe someone has already succeeded in adding jars to the classpath while using the notebook?

Recommended answer


Very similar, please let me know if this helps: https://issues.apache.org/jira/browse/SPARK-5185
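Building on that issue, one workaround sketch (an assumption based on how the Spark 1.4 launcher works, not spelled out in the original answer) is to pass the `--jars` flags through the `PYSPARK_SUBMIT_ARGS` environment variable, which the pyspark startup script reads even when `IPYTHON_OPTS` swaps the plain shell for an IPython notebook:

```shell
# Sketch (assumption): put the --jars flags into PYSPARK_SUBMIT_ARGS so the
# notebook-launched SparkContext picks them up. In Spark 1.4 the variable
# must end with the "pyspark-shell" token or the launcher rejects it.
export PYSPARK_SUBMIT_ARGS="--jars mongo-hadoop-core-1.4.0.jar,mongo-java-driver-3.0.2.jar pyspark-shell"

# Then launch the notebook as before (left as a comment here, since it
# requires a local Spark 1.4.1 install):
# IPYTHON_OPTS="notebook" spark-1.4.1/bin/pyspark
```

If the jars still do not load, adding them to `spark.driver.extraClassPath` in `conf/spark-defaults.conf` is another commonly used route.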
