Spark 2.0.0: SparkR CSV Import


Problem description


I am trying to read a CSV file into SparkR (running Spark 2.0.0) and experiment with the newly added features.

Using RStudio here.

I am getting an error while "reading" the source file.

My code:

Sys.setenv(SPARK_HOME = "C:/spark-2.0.0-bin-hadoop2.6")
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
sparkR.session(master = "local[*]", appName = "SparkR")
df <- loadDF("F:/file.csv", "csv", header = "true")

I get an error at the loadDF function.

The error:

loadDF("F:/file.csv", "csv", header = "true")

Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
  java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:258)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:359)
    at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:263)
    at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
    at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
    at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
    at org.apache.spark.sql.hive.HiveSharedSt

Am I missing some specification here? Any pointers to proceed would be appreciated.

Solution

I had the same problem. Even this simple code fails with a similar error:

createDataFrame(iris)

Could something be wrong with the installation?

UPDATE: Yes! I found the solution.

This solution is based on: Apache Spark MLlib with DataFrame API gives java.net.URISyntaxException when createDataFrame() or read().csv(...)

For R, just start the session with this code:

sparkR.session(sparkConfig = list(spark.sql.warehouse.dir="/file:C:/temp"))
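Putting the fix together with the code from the question, a minimal end-to-end sketch might look like the following. This is not a definitive recipe: the paths (`C:/spark-2.0.0-bin-hadoop2.6`, `F:/file.csv`) are the ones the question used, and `C:/temp` for the warehouse directory is just an example location that must exist and be writable.

```r
# Point SparkR at the local Spark installation (path from the question)
Sys.setenv(SPARK_HOME = "C:/spark-2.0.0-bin-hadoop2.6")
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))

# Setting spark.sql.warehouse.dir explicitly works around the Windows
# URI problem that surfaces as the InvocationTargetException above
sparkR.session(master = "local[*]", appName = "SparkR",
               sparkConfig = list(spark.sql.warehouse.dir = "/file:C:/temp"))

# With the session configured, the CSV import from the question should work
df <- loadDF("F:/file.csv", "csv", header = "true")
head(df)
```

In SparkR 2.0.0, `read.df("F:/file.csv", source = "csv", header = "true")` is an equivalent way to load the file; `loadDF` takes the same arguments.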
