Combining local processes with remote processes in Julia
Problem description
I'm trying to use remote processes in conjunction with local processes, but when I do, I get the following output:
julia> addprocs(["user@host"], tunnel=true, dir="~/julia-79599ada44/bin/", sshflags=`-p 6969`)
id: cannot find name for group ID 350
1-element Array{Any,1}:
2
julia> addprocs(23)
fatal error on 2: ERROR: connect: host is unreachable (EHOSTUNREACH)
in wait at ./task.jl:284
in wait at ./task.jl:194
in stream_wait at stream.jl:263
in wait_connected at stream.jl:301
in Worker at multi.jl:113
in anonymous at task.jl:905
fatal error on fatal error on 5: 6: fatal error on fatal error on fatal error on 9: 14: 8: Worker 3 terminated.
...
I have tried adding the local processes first, but I get the same errors when I then add the remote ones.
Recommended answer
I know the question is old, but I was asked today whether I knew the answer to this unanswered question.
You could use the -p option along with the --machinefile option:
Julia can be started in parallel mode with either the -p or the --machine-file options. -p n will launch an additional n worker processes, while --machine-file file will launch a worker for each line in file file. The machines defined in file must be accessible via a password-less ssh login, with Julia installed at the same location as the current host. Each machine definition takes the form [count*][user@]host[:port] [bind_addr[:port]]. user defaults to current user, port to the standard ssh port. count is the number of workers to spawn on the node, and defaults to 1. The optional bind-to bind_addr[:port] specifies the IP address and port that other workers should use to connect to this worker.
It has been a long time since I used the --machinefile option. In my case the count prefix (n) didn't work, and I don't know whether it has been fixed since. If it doesn't work for you either, you can add one line for each worker process you want instead. For example, if this doesn't work for you:
# machinefile.txt
23*user@host
you could try this instead:
# machinefile.txt
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
user@host
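Typing 23 identical lines by hand is error-prone, so here is a minimal sketch that generates the same file. It assumes a POSIX shell; user@host is the placeholder from the listing above, and the variable names are only illustrative.

```shell
#!/bin/sh
# Sketch: write one "user@host" line per remote worker wanted.
# Assumptions: "user@host" is a placeholder for the real login,
# and 23 is the number of remote workers from the example above.
target="user@host"
count=23
out="machinefile.txt"

: > "$out"                      # create the file, or truncate an old copy
i=0
while [ "$i" -lt "$count" ]; do
  printf '%s\n' "$target" >> "$out"
  i=$((i + 1))
done
```

The resulting file has one line per worker and no header line, so each line is a plain machine definition.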
And then invoke julia like this:
$ julia -p 2 --machinefile machinefile.txt
This gives a total of 25 worker processes (2 local and 23 remote).
That said, the count prefix (n) should work if it is documented. If it doesn't, please check whether a bug has already been reported and, if not, open a new one.