Loading com.databricks.spark.csv via RStudio

Question

I have installed Spark 1.4.0 along with its R package, SparkR, and I can use it both from the Spark shell and from RStudio. However, there is one difference between the two that I cannot resolve.

When launching the SparkR shell with

./bin/sparkR --master local[7] --packages com.databricks:spark-csv_2.10:1.0.3

I can read a .csv file as follows:

flights <- read.df(sqlContext, "data/nycflights13.csv", "com.databricks.spark.csv", header="true")

Unfortunately, when I start SparkR via RStudio (with SPARK_HOME set correctly), I get the following error message:

15/06/16 16:18:58 ERROR RBackendHandler: load on 1 failed
Caused by: java.lang.RuntimeException: Failed to load class for data source: com.databricks.spark.csv

I know I should somehow load com.databricks:spark-csv_2.10:1.0.3, but I have no idea how to do this. Could someone help me?

Recommended answer

This is the right syntax (found after hours of trying). Focus on the first line, and note the double quotes:

Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.0.3" "sparkr-shell"')

library(SparkR)
library(magrittr)

# Initialize SparkContext and SQLContext
sc <- sparkR.init(appName="SparkR-Flights-example")
sqlContext <- sparkRSQL.init(sc)


# Inspect the SQLContext created above
sqlContext
# Java ref type org.apache.spark.sql.SQLContext id 1

# Load the flights CSV file using `read.df`. Note that we use the CSV reader Spark package here.
flights <- read.df(sqlContext, "nycflights13.csv", "com.databricks.spark.csv", header="true")
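
The quoting in the first line of the answer is easy to get wrong, so it may help to assemble the value programmatically and print it before starting Spark. The following is a minimal sketch in base R, not part of the original answer: `build_submit_args` is a hypothetical helper name (SparkR provides no such function), and it simply reproduces the quoted form shown above. It relies on `--packages` accepting a comma-separated list of Maven coordinates, and on the variable being set before `sparkR.init()` is called.

```r
# Hypothetical helper (not part of SparkR): builds the SPARKR_SUBMIT_ARGS
# value in the same quoted form used in the answer above.
build_submit_args <- function(packages) {
  # --packages takes a comma-separated list of Maven coordinates
  coords <- paste(packages, collapse = ",")
  sprintf('"--packages" "%s" "sparkr-shell"', coords)
}

submit_args <- build_submit_args("com.databricks:spark-csv_2.10:1.0.3")

# Set the variable before calling sparkR.init(), which reads it
# when it launches the JVM backend.
Sys.setenv(SPARKR_SUBMIT_ARGS = submit_args)

# Print the stored value to check the quoting by eye
cat(Sys.getenv("SPARKR_SUBMIT_ARGS"), "\n")
```

If the printed value matches the string in the first line of the answer, the rest of the script (`library(SparkR)`, `sparkR.init`, `read.df`) can follow unchanged.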
