RStudio to connect to remote Hadoop server


Question

I have an Ubuntu desktop with RStudio installed, and a remote Hadoop cluster running on CentOS that I hope to connect to from RStudio. From my understanding this is a viable approach, but can someone please confirm this?

Answer

RStudio will not allow you to connect to Hadoop, but you can use the Hadoop streaming API to submit your Hadoop jobs.
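As a rough sketch of what the streaming approach involves: Hadoop streaming runs any executable that reads records from stdin and writes tab-separated key/value lines to stdout, so an R script can act as the mapper or the reducer. The word-count task and the function names `map_line` and `reduce_lines` below are hypothetical, chosen only for illustration, and the example runs the logic locally on a small character vector rather than on a cluster.

```r
# Hypothetical word-count mapper and reducer in the style Hadoop streaming
# expects: text lines in, tab-separated "key<TAB>value" lines out.

# Mapper step: split one input line into words and emit "word\t1" per word.
map_line <- function(line) {
  words <- strsplit(tolower(trimws(line)), "[^a-z0-9]+")[[1]]
  words <- words[nzchar(words)]          # drop empty tokens
  if (length(words) == 0) return(character(0))
  paste(words, 1L, sep = "\t")
}

# Reducer step: sum the counts for each key found in the mapper output.
reduce_lines <- function(lines) {
  if (length(lines) == 0) return(character(0))
  parts  <- strsplit(lines, "\t", fixed = TRUE)
  keys   <- vapply(parts, `[`, character(1), 1)
  counts <- as.integer(vapply(parts, `[`, character(1), 2))
  totals <- tapply(counts, keys, sum)
  paste(names(totals), totals, sep = "\t")
}

# Local dry run, so the logic can be checked without a cluster.
input   <- c("Hadoop streaming with R", "R and Hadoop")
mapped  <- unlist(lapply(input, map_line))
reduced <- reduce_lines(mapped)
print(reduced)
```

In an actual job, each function would presumably live in its own Rscript file that reads its input with `readLines(file("stdin"))` and writes results with `cat()`, and the two scripts would be passed as the mapper and reducer to the cluster's Hadoop streaming jar. The exact jar location and submission command depend on the installation, so they are left out here.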

There are a few packages to help you get started. I have used rmr to run map/reduce jobs on a Hadoop cluster with the streaming API. It is part of the RHadoop project, which can be found here:

https://github.com/RevolutionAnalytics/RHadoop/wiki

There is also the RHIPE package, which will allow you to communicate with the HDFS file system from inside your R scripts:

http://www.datadr.org/doc/functions.html
