How to get rank in aprun


Question


I am trying to run a multi-node job with aprun. However, I couldn't figure out how to get the rank (or whatever serves as the ID of each spawned process) in a bash environment. Take this simple job:

aprun -n 8 -N 2 ./examplebashscript.sh


How can I get the rank in each spawned job? Without a rank or some other unique ID, this aprun line will just run the exact same program 8 times (once per instance requested with -n), which is undesirable.


I've been reading the documentation, and surprisingly I couldn't find anything that even explains the default variables provided by aprun.


I've worked with mpirun before, and I know how to get the rank of each process in C and Python programs, but not in Bash. aprun is even less documented.

Recommended answer


Try looking for the environment variable ALPS_APP_PE in the bash script that you launch with aprun.


It will be different for each instance of the script (the number of instances created is given by the -n option of the aprun command).
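As a rough illustration, examplebashscript.sh could read ALPS_APP_PE and use it to choose per-instance work. This is only a sketch: the input_N.dat naming scheme and the input_for_rank helper are made up for this example, and the script falls back to the placeholder "unset" when the variable is missing (for instance when run outside aprun).

```bash
#!/bin/bash
# Sketch: select per-instance work based on ALPS_APP_PE.
# Assumption: the input_<rank>.dat naming scheme is hypothetical.

# Print the (hypothetical) input file name for the given rank.
input_for_rank() {
    local rank="$1"
    printf 'input_%s.dat\n' "$rank"
}

# Fall back to a placeholder if the variable is not defined.
rank="${ALPS_APP_PE:-unset}"

echo "instance ${rank} would process $(input_for_rank "${rank}")"
```

Launched with the aprun line from the question, each of the 8 instances should print a different rank, assuming the site exposes ALPS_APP_PE.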


If the script subsequently executes an instance of an MPI program, that instance will have the MPI rank given by ALPS_APP_PE.


The caveat is that some Cray sites may decide not to expose this variable, or to use a different name. Very old ALPS versions also don't support it, but these are rare.
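Given that caveat, a script may want to fail loudly rather than silently run every instance with the same ID. The following is a minimal sketch; get_alps_rank is a hypothetical helper name, and it checks only ALPS_APP_PE because any site-specific alternative name would have to come from the site's own documentation.

```bash
#!/bin/bash
# Sketch: refuse to guess a rank when ALPS_APP_PE is not available.
# Assumption: get_alps_rank is a helper invented for this example.

# Print the rank, or return 1 with a message on stderr if it is unset.
get_alps_rank() {
    if [ -z "${ALPS_APP_PE:-}" ]; then
        echo "ALPS_APP_PE is not set; check which variable your site exposes" >&2
        return 1
    fi
    printf '%s\n' "${ALPS_APP_PE}"
}

if rank="$(get_alps_rank)"; then
    echo "running as rank ${rank}"
else
    echo "no rank available; not started by aprun, or the variable is hidden" >&2
fi
```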


See this CUG 2014 paper for an example:

https://cug.org/proceedings/cug2014_proceedings/includes/files/pap136.pdf

