R Parallel - connecting to remote cores

Question

Working in R 2.14.1, on Windows 7

Using the package parallel in R, I'm trying to take advantage of cores outside of my local machine available on my network, where all remote hosts I am connecting to are identical Windows machines.

The basic form of the command to make the connection is as follows.

library(parallel)
#assume 8 cores per machine
cl<-makePSOCKcluster(c(rep("localhost", 8), rep("otherhost", 8)))

Of course, trying to debug these things can be pretty tricky, but here is where I'm at with it.

If I specify the manual = TRUE flag as below

cl<-makePSOCKcluster(c(rep("localhost", 8), rep("otherhost", 8)), manual=TRUE)

there are no problems connecting to the remote host, and running a parallel process. The computers have identical setups to the one that I am working on. Yet, when this manual flag is not set, the connection command hangs.
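
To confirm that workers are really running where expected once a cluster is up, each worker can be asked for its host name. The following is a minimal sketch that assumes a small local-only cluster of two workers so it can run on a single machine; with the real cluster, the host vector from the commands above would be used instead.

```r
library(parallel)

# Assumption: two local workers only, so the check runs without any remote host.
cl <- makePSOCKcluster(rep("localhost", 2))

# Ask every worker which machine it is running on.
worker_hosts <- unlist(clusterCall(cl, function() Sys.info()[["nodename"]]))

# Count the workers per machine.
print(table(worker_hosts))

# Always shut the workers down when finished.
stopCluster(cl)
```

With remote hosts in the cluster, the table should list one entry per machine, which makes it easy to see whether the remote workers actually started.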

Since the manual flag bypasses ssh when making the connection to the host, this seems to indicate that ssh is the problem when manual = FALSE.

It is not guaranteed at the moment that the remote computers have ssh on them. The question is, given that I have all the pertinent Windows login information for my remote hosts, and that I cannot change the settings on the remote computers, how would I connect to cores on remote machines with the package parallel in R without specifying manual = TRUE?

Alternatively, if ssh must be installed for this to happen, let's assume all computers have ssh on them. How would I connect to cores on the remote machines without circumventing ssh?

If you need any more information please let me know, I appreciate the time.

8-26-14

Thanks to Steve Weston for his insights. I will provide an update with the exact tools and setup I use to get my system working when it's up and running.

Feel free to comment or post if you have anything to add about the best route for connecting remotely from one Windows machine to another via makePSOCKcluster with the manual flag set to FALSE.

Recommended answer

When creating a PSOCK cluster with manual=FALSE, the only way to start a worker on a remote machine is with "ssh", "rsh", or something command-line compatible, such as "plink" from PuTTY. The reason is that makePSOCKcluster starts the remote workers using the "system" function to execute commands of the form:

ssh -l user otherhost '/usr/lib/R/bin/Rscript' -e 'parallel:::.slaveRSOCK()' MASTER=myhost PORT=10187 OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE

You can confirm this by looking at the source code for the newPSOCKnode function in the file snowSOCK.R from the parallel package.
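
To make the shape of that command concrete, here is a simplified sketch that assembles an equivalent string. It is not the actual parallel source: the helper name build_worker_command and its arguments are hypothetical, and the real newPSOCKnode handles more options than this.

```r
# Hypothetical helper, for illustration only; not part of the parallel package.
build_worker_command <- function(host, master, port, user,
                                 rshcmd = "ssh",
                                 rscript = "/usr/lib/R/bin/Rscript") {
  # NAME=value settings passed to the worker on its command line.
  worker_args <- paste(
    c("MASTER=", "PORT=", "OUT=", "TIMEOUT=", "METHODS=", "XDR="),
    c(master, port, "/dev/null", 2592000, "TRUE", "TRUE"),
    sep = "", collapse = " "
  )
  # The remote shell command, followed by the Rscript call it should run.
  paste(rshcmd, "-l", user, host,
        shQuote(rscript, type = "sh"),
        "-e", shQuote("parallel:::.slaveRSOCK()", type = "sh"),
        worker_args)
}

cmd <- build_worker_command("otherhost", "myhost", 10187, "user")
cat(cmd, "\n")
```

The point of the sketch is that the first word of the command (here ssh) is the only part that reaches the remote machine, so whatever program stands in that position has to exist locally and be able to log in to the remote host without prompting.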

For this to work, the ssh-compatible command must be available on the local machine and a corresponding ssh daemon must be running on each of the remote machines, otherwise makePSOCKcluster will simply hang. I've found that installing a good, working ssh daemon is the difficult part on Windows.
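
If an ssh daemon is available on the remote machines, a call along the following lines may work with plink as the launcher. This is only a sketch under several assumptions: plink is on the PATH of the local machine, key-based login is set up so that no password prompt appears, and the user name "myuser" is a placeholder. The rshcmd and user options are the ones documented for makePSOCKcluster; the wrapper function start_remote_cluster is hypothetical and is defined but not called here, because calling it without a reachable ssh daemon would hang.

```r
library(parallel)

# Same host layout as in the question: 8 cores per machine.
hosts <- c(rep("localhost", 8), rep("otherhost", 8))

# Hypothetical wrapper; calling it tries to launch workers through plink.
start_remote_cluster <- function(hosts, user = "myuser") {
  makePSOCKcluster(
    hosts,
    rshcmd = "plink -ssh",  # assumption: PuTTY's plink is installed and on the PATH
    user = user             # placeholder login name on the remote machines
  )
}

# Usage, once the ssh daemon and keys are in place:
# cl <- start_remote_cluster(hosts)
# stopCluster(cl)
```

Because the machines in the question are identical, the default path to Rscript taken from the local machine should also be valid on the remote hosts; if it were not, the rscript option would need to be set as well.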

Unfortunately, manual=TRUE is generally the easiest way to create a PSOCK cluster on multiple Windows machines.
