Starting a Slurm array job with a specified number of nodes

Question

I'm trying to align 168 sequence files on our HPC cluster using Slurm version 14.03.0. I'm only allowed to use a maximum of 9 compute nodes at once, to keep some nodes open for other people.

I renamed the files so I could use the array feature of sbatch. The sequence files look like this: Sequence1.fastq.gz, Sequence2.fastq.gz, … Sequence168.fastq.gz

I can't figure out how to tell it to run all 168 files, 9 at a time. I can get it to run all 168 files, but it uses all the available nodes, which will get me in trouble since this is going to run for a few days.

I've found that I should be able to use "--array=1-168%9" to specify how many tasks run at once, but this was implemented in a newer version of Slurm than we have on our cluster. Is there an alternate way to get this functionality? I've been trying things and pulling my hair out for a couple of weeks.

The way I'm trying to run it is:

#!/bin/bash
#SBATCH --job-name=McSeqs
#SBATCH --nodes=1
#SBATCH --array=1-168
srun alignmentProgramHere Sequence${SLURM_ARRAY_TASK_ID}.fastq.gz -o outputdirectory/

Thanks! Matt

Recommended answer

I think I've figured out a way to make it work. The trick is that the sbatch options all get passed to each array instance. I used the --exclude option to tell each array instance not to use half of the compute nodes. So now I'm running 9 of my files at once, leaving the excluded compute nodes open for other people.

#!/bin/bash
#SBATCH --job-name=McSeqs
#SBATCH --nodes=1
#SBATCH --array=1-168
#SBATCH --exclude=cluster[10-20]

srun alignmentProgramHere Sequence${SLURM_ARRAY_TASK_ID}.fastq.gz -o outputdirectory/
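
If excluding nodes is not an option, another possible workaround on a Slurm version without the "%9" throttle is to submit the array in chunks of 9 and chain each chunk to the previous one with a job dependency. The sketch below only plans that: it prints the sbatch commands rather than running them. The function names chunk_ranges and plan_submissions and the script name align.sh are hypothetical, and I am assuming your installed sbatch supports --parsable and --dependency=afterany on array jobs, so check the sbatch man page for your version before relying on it.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): split task IDs 1..total
# into consecutive ranges of at most `size` IDs, printing one range per line.
chunk_ranges() {
    local total="$1" size="$2" start=1 end
    while [ "$start" -le "$total" ]; do
        end=$((start + size - 1))
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        printf '%s\n' "${start}-${end}"
        start=$((end + 1))
    done
}

# Dry run: print one sbatch command per chunk instead of submitting anything.
# Every chunk after the first is made to wait for the previous chunk's job ID.
# Assumption: --parsable makes sbatch print only the job ID.
plan_submissions() {
    local total="$1" size="$2" script="$3" dep="" range
    chunk_ranges "$total" "$size" | while read -r range; do
        printf '%s\n' "jobid=\$(sbatch --parsable ${dep}--array=${range} ${script})"
        dep="--dependency=afterany:\${jobid} "
    done
}
```

Calling plan_submissions 168 9 align.sh prints 19 commands, the first for --array=1-9 and the last for --array=163-168. The batch script itself would need its own "#SBATCH --array" line removed so the command-line range is the only one. This is coarser than a real throttle: a chunk starts only after every task in the previous chunk has finished, so one slow file holds up the next nine.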
