How to add spark packages to Spark R notebook on DSX?
Question
The Spark documentation shows how a Spark package can be added:
sparkR.session(sparkPackages = "com.databricks:spark-avro_2.11:3.0.0")
I believe this can only be used when initialising the session.
How can we add Spark packages for SparkR using a notebook on DSX?
Answer
Please use the pixiedust package manager to install the avro package.
pixiedust.installPackage("com.databricks:spark-avro_2.11:3.0.0")
http://datascience.ibm.com/docs/content/analyze-data/Package-Manager.html
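Note that both `sparkPackages` and `pixiedust.installPackage` take the same kind of argument: a Maven coordinate of the form `groupId:artifactId:version`. As a minimal illustrative sketch (the helper below is not part of pixiedust or SparkR, just a way to see the three parts of the coordinate used in this answer):

```python
def parse_maven_coordinate(coord):
    """Split a Maven coordinate string into its three components."""
    group_id, artifact_id, version = coord.split(":")
    return {"group": group_id, "artifact": artifact_id, "version": version}

# The coordinate used throughout this answer:
parts = parse_maven_coordinate("com.databricks:spark-avro_2.11:3.0.0")
print(parts["group"])     # com.databricks
print(parts["artifact"])  # spark-avro_2.11  (the _2.11 suffix is the Scala version)
print(parts["version"])   # 3.0.0
```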
Install it from the Python 1.6 kernel, since pixiedust is importable in Python. (Remember this is installed at your Spark instance level.) Once you install it, restart the kernel, switch to the R kernel, and then read the avro like this:
df1 <- read.df("episodes.avro", source = "com.databricks.spark.avro", header = "true")
head(df1)
Full notebook:
https://github.com/charles2588/bluemixsparknotebooks/raw/master/R/sparkRPackageTest.ipynb
Thanks, Charles.