Installing SparkR


Question


I have the latest version of R - 3.2.1. Now I want to install SparkR on R. After I execute:

> install.packages("SparkR")

I got back:

Installing package into ‘/home/user/R/x86_64-pc-linux-gnu-library/3.2’
(as ‘lib’ is unspecified)
Warning in install.packages :
  package ‘SparkR’ is not available (for R version 3.2.1)

I have also installed Spark on my machine

Spark 1.4.0

How can I solve this problem?

Solution

You can install directly from a GitHub repository:

if (!require('devtools')) install.packages('devtools')
devtools::install_github('apache/spark@v2.x.x', subdir='R/pkg')

You should choose the tag (v2.x.x above) corresponding to the version of Spark you use. You can find a full list of tags on the project page or directly from R using the GitHub API:

jsonlite::fromJSON("https://api.github.com/repos/apache/spark/tags")$name
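Once you have that list of tag names, you still need to pick the right one for your Spark line. The shell sketch below shows one way to do that offline; the hard-coded tag list and the choice of the 2.4 line are illustrative assumptions, not live API output:

```shell
# Sketch: pick the newest tag in a given Spark line (2.4 here, as an example).
# The tag list below is a hard-coded sample, not fetched from the GitHub API.
tags="v2.4.0
v2.4.1
v2.3.3
v2.2.2"

# Keep only the 2.4.x tags, version-sort them, and take the newest one.
printf '%s\n' "$tags" | grep '^v2\.4\.' | sort -V | tail -n 1   # prints v2.4.1
```

The resulting tag is what you would pass to `devtools::install_github('apache/spark@<tag>', subdir='R/pkg')`.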

If you've downloaded a binary package from the downloads page, the R library is in the R/lib/SparkR subdirectory. It can be used to install SparkR directly. For example:

$ export SPARK_HOME=/path/to/spark/directory
$ cd $SPARK_HOME/R/pkg/
$ R -e "devtools::install('.')"
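Those three steps can also be wrapped in a small script that checks the path before attempting anything; this is only a sketch, and /path/to/spark/directory is a placeholder you must replace with your own Spark location:

```shell
# Sketch: install SparkR from a local Spark tree, guarding against a
# missing or mistyped SPARK_HOME. The path below is a placeholder.
SPARK_HOME=/path/to/spark/directory
if [ -d "$SPARK_HOME/R/pkg" ]; then
  # Only attempt the install when the R source package is actually there.
  (cd "$SPARK_HOME/R/pkg" && R -e "devtools::install('.')")
else
  echo "Not a Spark source tree: $SPARK_HOME" >&2
fi
```

The guard avoids a confusing devtools error when the variable points at the wrong directory.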

You can also add that R library to .libPaths (taken from here):

Sys.setenv(SPARK_HOME='/path/to/spark/directory')
.libPaths(c(file.path(Sys.getenv('SPARK_HOME'), 'R', 'lib'), .libPaths()))
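If you want those two lines available every time, one option (a sketch; the file name load_sparkr.R and the SPARK_HOME path are assumptions) is to write them into a small R script you can source before using SparkR:

```shell
# Sketch: save the .libPaths setup as a reusable R script.
# /path/to/spark/directory is a placeholder for your Spark location.
export SPARK_HOME=/path/to/spark/directory
cat > load_sparkr.R <<'EOF'
# Prepend the Spark distribution's R library, then load SparkR from it.
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
EOF
# Then run it with: R --no-save < load_sparkr.R (requires a local Spark distribution)
```

This keeps the library path change out of your global R profile, so it only applies when you explicitly source the script.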

Finally, you can use the sparkR shell without any additional steps:

$ /path/to/spark/directory/bin/sparkR

Edit

According to the Spark 2.1.0 release notes, SparkR should be available on CRAN in the future:

Standalone installable package built with the Apache Spark release. We will be submitting this to CRAN soon.

You can follow SPARK-15799 to check the progress.

Edit 2

While SPARK-15799 has been merged, satisfying CRAN requirements proved to be challenging (see for example the discussions about 2.2.2, 2.3.1, 2.4.0), and the packages were subsequently removed (see for example SparkR was removed from CRAN on 2018-05-01 and CRAN SparkR package removed?). As a result, the methods listed in the original post are still the most reliable solutions.

Edit 3

OK, SparkR is back up on CRAN again as of v2.4.1. install.packages('SparkR') should work again (it may take a couple of days for the mirrors to reflect this).
