Make use of all CPUs on SLURM


Question

I would like to run a job on a cluster. Different nodes have different numbers of CPUs, and I do not know in advance which nodes will be assigned to me. Which options should I use so that the job creates as many tasks as there are CPUs across all allocated nodes?

#!/bin/bash -l

#SBATCH -p normal
#SBATCH -N 4
#SBATCH -t 96:00:00

srun -n 128 ./run

Recommended answer

One dirty hack to achieve this is to use the environment variables that SLURM sets inside the job. For a sample sbatch file:

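#!/bin/bash
#SBATCH --job-name=test
#SBATCH --output=res.txt
#SBATCH --time=10:00
#SBATCH --nodes=2

# CPUs on the node running this script, and number of allocated nodes
echo "$SLURM_CPUS_ON_NODE"
echo "$SLURM_JOB_NUM_NODES"
num_core=$SLURM_CPUS_ON_NODE
num_node=$SLURM_JOB_NUM_NODES
# Total tasks = CPUs per node * number of nodes (assumes identical nodes)
proc_num=$(( num_core * num_node ))
echo "$proc_num"
srun -n "$proc_num" ./run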

The job script requests only the number of nodes. $SLURM_CPUS_ON_NODE gives the number of CPUs available on the node where the script runs, and you can combine it with other environment variables (e.g. $SLURM_JOB_NUM_NODES) to work out how many tasks are possible. The script above computes the task count on the assumption that the nodes are homogeneous, i.e. that the single number in $SLURM_CPUS_ON_NODE applies to every allocated node.

For heterogeneous nodes a single number is not enough. The per-node CPU counts are reported in $SLURM_JOB_CPUS_PER_NODE (e.g. 2,3 if the allocated nodes have 2 and 3 CPUs), in the same order as the nodes listed in $SLURM_JOB_NODELIST. From those values you can calculate the required number of tasks.
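For the heterogeneous case, the total can be summed from $SLURM_JOB_CPUS_PER_NODE, which Slurm exports in a compressed form such as 16(x2),8 (two nodes with 16 CPUs, then one with 8). The following is a minimal sketch under that assumption, not a definitive implementation. The helper name total_cpus and the fallback value are made up for illustration so that the snippet also runs outside a job.

```shell
#!/bin/bash
# Sum the CPU counts in a SLURM_JOB_CPUS_PER_NODE-style string.
# Assumed format: count[(xrepeat)][,count[(xrepeat)]...], e.g. "16(x2),8".
# total_cpus is a hypothetical helper, not part of Slurm.
total_cpus() {
    local spec=$1 total=0 entry count repeat
    local -a entries
    IFS=',' read -r -a entries <<< "$spec"
    for entry in "${entries[@]}"; do
        count=${entry%%\(*}            # digits before "(" (or the whole entry)
        repeat=1
        if [[ $entry == *"(x"* ]]; then
            repeat=${entry##*\(x}      # text after "(x"
            repeat=${repeat%\)}        # drop the trailing ")"
        fi
        total=$(( total + count * repeat ))
    done
    echo "$total"
}

# Fall back to an illustrative value when not running inside a Slurm job.
spec=${SLURM_JOB_CPUS_PER_NODE:-"16(x2),8"}
proc_num=$(total_cpus "$spec")
echo "$proc_num"
# Inside a real job script, this would be followed by: srun -n "$proc_num" ./run
```

With the illustrative value 16(x2),8 the function returns 40. Whether the job actually receives every CPU on each node depends on how the cluster is configured, so it is worth checking the printed value before relying on it.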
