How to load jar dependencies in IPython Notebook
Question
This page (https://medium.com/@chris_bour/6-differences-between-pandas-and-spark-dataframes-1380cec394d2#.85lrap56d) inspired me to try out spark-csv for reading .csv files in Pyspark. I found a couple of posts, such as this one (http://stackoverflow.com/questions/30757439/how-to-add-any-new-library-like-spark-csv-in-apache-spark-prebuilt-version), describing how to use spark-csv.
But I am not able to initialize the IPython instance by including either the .jar file or the package extension in the start-up command, the way it can be done through spark-shell.
That is, instead of

ipython notebook --profile=pyspark

I tried

ipython notebook --profile=pyspark --packages com.databricks:spark-csv_2.10:1.0.3

but that option is not supported.
Please advise.
Answer
You can simply pass it in the PYSPARK_SUBMIT_ARGS variable. For example:
export PACKAGES="com.databricks:spark-csv_2.11:1.3.0"
export PYSPARK_SUBMIT_ARGS="--packages ${PACKAGES} pyspark-shell"
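You could export these variables in your shell before launching ipython notebook --profile=pyspark, or set them from Python before the SparkContext is created. A minimal sketch of the Python variant (the package coordinates are the same as in the shell example; adjust the Scala/Spark versions to match your installation):

```python
import os

# Must be set before pyspark starts the JVM / creates the SparkContext,
# so run this at the very top of the notebook or in a startup script.
packages = "com.databricks:spark-csv_2.11:1.3.0"
os.environ["PYSPARK_SUBMIT_ARGS"] = "--packages {0} pyspark-shell".format(packages)

print(os.environ["PYSPARK_SUBMIT_ARGS"])
# --packages com.databricks:spark-csv_2.11:1.3.0 pyspark-shell
```

Once the notebook's SparkContext starts with this setting, the package is on the classpath and spark-csv's documented read format, sqlContext.read.format("com.databricks.spark.csv"), should work for loading .csv files.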