SparkR and Packages


Question

How does one load Spark packages so that they can be used for data operations from R?

For example, I am trying to access my test.csv in HDFS as below:

Sys.setenv(SPARK_HOME="/opt/spark14")
library(SparkR)
sc <- sparkR.init(master="local")
sqlContext <- sparkRSQL.init(sc)
flights <- read.df(sqlContext, "hdfs://sandbox.hortonWorks.com:8020/user/root/test.csv", "com.databricks.spark.csv", header="true")

But I get the following error:

Caused by: java.lang.RuntimeException: Failed to load class for data source: com.databricks.spark.csv

I tried loading the CSV package with the option below:

Sys.setenv('SPARKR_SUBMIT_ARGS'='--packages com.databricks:spark-csv_2.10:1.0.3')

But then I get the error below while initializing the sqlContext:

Launching java with spark-submit command /opt/spark14/bin/spark-submit --packages com.databricks:spark-csv_2.10:1.0.3 /tmp/RtmpuvwOky/backend_port95332e5267b
Error: Cannot load main class from JAR file:/tmp/RtmpuvwOky/backend_port95332e5267b

Any help would be highly appreciated.

Recommended answer

It looks like setting SPARKR_SUBMIT_ARGS overrides the default value, which is sparkr-shell. That would explain the "Cannot load main class from JAR file" error: without sparkr-shell, spark-submit treats the backend port file as the application to run. You could probably keep the same approach and just append sparkr-shell to the end of your SPARKR_SUBMIT_ARGS. This seems unnecessarily complex compared to depending on jars, so I've created a JIRA to track the issue (and I'll try to add a fix if the SparkR people agree with me): https://issues.apache.org/jira/browse/SPARK-8506
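As a rough sketch of that suggestion, the variable can be set with sparkr-shell appended before the SparkContext is created. The paths, package coordinates, and calls are taken from the question. Whether this plain space-separated form is accepted may depend on the Spark version, so treat it as an assumption rather than a confirmed fix.

```r
# Assumption: SPARKR_SUBMIT_ARGS has to end with "sparkr-shell", the default
# value that is lost when the variable is overridden.
Sys.setenv(SPARK_HOME = "/opt/spark14")
Sys.setenv(SPARKR_SUBMIT_ARGS = "--packages com.databricks:spark-csv_2.10:1.0.3 sparkr-shell")

# Set the variables before initializing: sparkR.init() is what launches
# spark-submit, and it reads SPARKR_SUBMIT_ARGS at that point.
library(SparkR)
sc <- sparkR.init(master = "local")
sqlContext <- sparkRSQL.init(sc)

# Same read as in the question, now with the spark-csv package requested.
flights <- read.df(sqlContext,
                   "hdfs://sandbox.hortonWorks.com:8020/user/root/test.csv",
                   "com.databricks.spark.csv",
                   header = "true")
```

If the R session already has a running SparkR backend, it probably needs to be restarted first, since the submit arguments are only read when the backend is launched.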

Note: another option would be to start the sparkr command with --packages com.databricks:spark-csv_2.10:1.0.3, since that should work.
