Parallel R on a Windows cluster

Question

I've got a Windows HPC Server running with some nodes in the backend. I would like to run Parallel R using multiple nodes from the backend. I think Parallel R might be using SNOW on Windows, but I'm not sure. My question is, do I also need to install R on the backend nodes? Say I want to use two nodes, 32 cores per node:

cl <- makeCluster(c(rep("COMP01",32),rep("COMP02",32)),type="SOCK")

Right now, it hangs.

What else do I need to do? Do the backend nodes need some kind of sshd running to be able to communicate with each other?

Recommended answer

Setting up snow on a Windows cluster is rather difficult. Each of the machines needs to have R and snow installed, but that's the easy part. To start a SOCK cluster, you would need sshd running on each of the worker machines, and even then you can still run into trouble, so I wouldn't recommend it unless you're good at debugging and Windows system administration.

I think your best option on a Windows cluster is to use MPI. I don't have any experience with MPI on Windows myself, but I've heard of people having success with the MPICH and DeinoMPI MPI distributions for Windows. Once MPI is installed on your cluster, you also need to install the Rmpi package from source on each of your worker machines. You would then create the cluster object using the makeMPIcluster function. It's a lot of work, but I think it's more likely to eventually work than trying to use a SOCK cluster due to the problems with ssh/sshd on Windows.
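As a rough illustration of the MPI route, here is a minimal sketch. It assumes an MPI runtime, Rmpi, and snow are already installed and working on every node. The helper names `start_mpi_cluster` and `worker_hosts` are made up for this example, and the worker count simply mirrors the two nodes with 32 cores each from the question.

```r
# Sketch only: assumes an MPI runtime, Rmpi, and snow are installed on every node.
n_nodes <- 2
cores_per_node <- 32
n_workers <- n_nodes * cores_per_node  # one worker process per core

# Wrapped in a function so that nothing is spawned until it is called.
start_mpi_cluster <- function(n_workers) {
  library(Rmpi)  # loaded explicitly; snow's MPI transport depends on it
  library(snow)
  makeMPIcluster(n_workers)
}

# Hypothetical check: ask every worker which machine it is running on.
worker_hosts <- function(cl) {
  unlist(clusterCall(cl, function() Sys.info()[["nodename"]]))
}

# Usage, once MPI is set up:
# cl <- start_mpi_cluster(n_workers)
# table(worker_hosts(cl))
# stopCluster(cl)
```

Tabulating the host names is a quick way to confirm that the workers really landed on both machines rather than all on one.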

If you're desperate to run a parallel job once or twice on a Windows cluster, you could try using manual mode. It allows you to create a SOCK cluster without ssh:

library(snow)
workers <- c(rep("COMP01", 32), rep("COMP02", 32))
cl <- makeSOCKcluster(workers, manual = TRUE)

The makeSOCKcluster function will prompt you to start each one of the workers, displaying the command to use for each. You have to manually open a command window on the specified machine and execute the specified command. It can be extremely tedious, particularly with many workers, but at least it's not complicated or tricky. It can also be very useful for debugging in combination with the outfile='' option.
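Combining manual mode with the outfile option might look like the sketch below. The helpers `build_workers` and `start_manual_cluster` are hypothetical names introduced here, and the host names are the ones from the question. With `outfile = ""` the workers' output is not redirected, so error messages should appear in the command window where each worker was started.

```r
# Hypothetical helper: expand node names into one entry per worker process.
build_workers <- function(nodes, cores_per_node) {
  rep(nodes, each = cores_per_node)
}

workers <- build_workers(c("COMP01", "COMP02"), 32)

# Wrapped in a function so that nothing is launched until it is called.
start_manual_cluster <- function(workers) {
  library(snow)
  # manual = TRUE: snow prints the command to run for each worker and waits.
  # outfile = "": worker output is not redirected to a file.
  makeSOCKcluster(workers, manual = TRUE, outfile = "")
}

# Usage (interactive; you start each worker by hand when prompted):
# cl <- start_manual_cluster(workers)
# clusterCall(cl, function() Sys.info()[["nodename"]])
# stopCluster(cl)
```

When debugging, it may be easier to start with one or two workers and scale up once the connection works.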
