Computing usage of independent cores and binding a process to a core


Question

I am working with MPI, and I have a certain hierarchy of operations. For a particular value of a parameter _param, I launch 10 trials, each running a specific process on a distinct core. For n values of _param, the code runs in a certain hierarchy as:

driver_file -> launches one process which checks if available processes are more than 10. If more than 10 are available, then it launches an instance of a process with a specific _param value passed as an argument to coupling_file

coupling_file -> does some elementary computation, and then launches 10 processes using MPI_Comm_spawn(), each corresponding to a trial_file, while passing _trial as an argument (a rough sketch of this call is given after this overview)

trial_file -> computes work, returns values to the coupling_file
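
For reference, a rough sketch in C of what the spawning step inside coupling_file might look like. The executable name "./trial_file", the hard-coded trial count of 10, and passing _trial as a command-line string are assumptions made for illustration, not the actual code:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* ... elementary computation done by coupling_file ... */

    char trial_arg[16] = "1";                 /* assumed: _trial passed as argv[1] of the children */
    char *spawn_argv[] = { trial_arg, NULL }; /* NULL-terminated argv for the spawned processes */
    int errcodes[10];
    MPI_Comm trial_comm;                      /* intercommunicator connecting coupling_file to the 10 trials */

    /* Launch 10 trial processes; rank 0 of MPI_COMM_WORLD supplies the arguments. */
    MPI_Comm_spawn("./trial_file", spawn_argv, 10, MPI_INFO_NULL, 0,
                   MPI_COMM_WORLD, &trial_comm, errcodes);

    /* ... collect the trials' results over trial_comm ... */

    MPI_Finalize();
    return 0;
}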

I am facing two dilemmas, namely:

  1. How do I evaluate the required condition for the cores in driver_file? As in, how do I find out how many processes have been terminated, so that I can correctly schedule processes on idle cores? I thought about adding a blocking MPI_Recv() and using it to pass a variable that would tell me when a certain process has finished (a rough sketch of this idea follows after this list), but I'm not sure if this is the best solution.

  2. How do I ensure that processes are assigned to different cores? I had thought about using something like mpiexec --bind-to-core --bycore -n 1 coupling_file to launch one coupling_file. This would be followed by something like mpiexec --bind-to-core --bycore -n 10 trial_file launched by the coupling_file. However, if I am binding processes to a core, I don't want the same core to end up with two or more processes. That is, I don't want _trial_1 of _coupling_1 to run on core x, and then launch another process coupling_2 whose _trial_2 also gets bound to core x.
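
As mentioned in point 1, here is a rough sketch of the blocking-MPI_Recv() notification idea, shown as two free functions. The tag, the int payload, and the communicator parameter comm are assumptions; with MPI_Comm_spawn the relevant communicator would be the intercommunicator returned by the spawn (or MPI_Comm_get_parent() on the child side):

#include <mpi.h>

#define FINISHED_TAG 7   /* assumed tag for "I am done" messages */

/* Driver side: block until any coupling run reports completion,
 * returning the source rank so its block of cores can be reused. */
int wait_for_free_slot(MPI_Comm comm)
{
    int finished_param;
    MPI_Status status;
    MPI_Recv(&finished_param, 1, MPI_INT, MPI_ANY_SOURCE, FINISHED_TAG,
             comm, &status);
    return status.MPI_SOURCE;
}

/* Coupling side: notify the driver just before exiting. */
void report_finished(MPI_Comm comm, int my_param)
{
    MPI_Send(&my_param, 1, MPI_INT, 0, FINISHED_TAG, comm);
}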

Any input would be appreciated. Thanks!

Answer

If it is an option for you, I'd drop the spawning processes thing altogether, and instead start all processes at once. You can then easily partition them into chunks working on a single task. A translation of your concept could for example be:

  • Have one master (rank 0)
  • Divide the remaining processes into groups of 10, creating a new communicator for each group if needed; each group has one leader process whose rank is known to the master (a minimal sketch of this split follows below)
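
A minimal sketch of that split, assuming a group size of 10 and rank 0 acting as the master while staying outside the worker groups (all names here are illustrative):

#include <mpi.h>

int main(int argc, char **argv)
{
    int world_rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* The master keeps MPI_UNDEFINED; the remaining ranks fall into
     * consecutive groups of 10 (ranks 1-10 -> group 0, 11-20 -> group 1, ...). */
    int color = (world_rank == 0) ? MPI_UNDEFINED : (world_rank - 1) / 10;

    MPI_Comm group_comm = MPI_COMM_NULL;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &group_comm);

    if (group_comm != MPI_COMM_NULL) {
        int group_rank;
        MPI_Comm_rank(group_comm, &group_rank);
        int is_leader = (group_rank == 0);
        /* The leader of group g has world rank 1 + 10 * g, so the master
         * can address every leader directly in MPI_COMM_WORLD. */
        (void)is_leader;
    }

    /* ... master / group leader / worker logic as in the pseudocode below ... */

    MPI_Finalize();
    return 0;
}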

In your code you then can do something like:

if master:
    send a specific _param to each group leader (with a non-blocking send)
    loop over all your different _params
        use MPI_Waitany or MPI_Waitsome to find groups that are ready
else
    if groupleader:
        loop endlessly
            MPI_Recv _params from master
            coupling_file
            MPI_Bcast to group
            process trial_file
    else
        loop endlessly
            MPI_Bcast (get data from groupleader)
            process trial_file

I think following this approach would allow you to solve both of your issues. The availability of a process group is detected via MPI_Wait*, though you may want to change the logic above so that each group leader notifies the master at the end of its task; the master then sends new data only once a group is actually done, rather than while a previous trial is still running and another group might have finished sooner. Pinning is also resolved: since you have a fixed number of processes, they can be pinned properly during the usual startup.
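
A hedged sketch of what that modified master-side loop might look like in C. The tags, the empty "done" message, and the helper parameters comm, leader_ranks, params and n_params are assumptions made for illustration:

#include <mpi.h>

#define PARAM_TAG 2   /* assumed tag for handing out _param values */
#define DONE_TAG  3   /* assumed tag for "group finished" notifications */

/* Master-side scheduling: seed every group with a _param, then hand out the
 * next one only when some group has reported completion via DONE_TAG. */
void master_loop(MPI_Comm comm, int n_groups, const int *leader_ranks,
                 const double *params, int n_params)
{
    MPI_Request done_req[n_groups];   /* C99 variable-length array */
    int dummy, next = 0;

    for (int g = 0; g < n_groups; ++g)
        done_req[g] = MPI_REQUEST_NULL;

    /* Seed each group with its first parameter and post a "done" receive. */
    for (int g = 0; g < n_groups && next < n_params; ++g, ++next) {
        MPI_Send(&params[next], 1, MPI_DOUBLE, leader_ranks[g], PARAM_TAG, comm);
        MPI_Irecv(&dummy, 0, MPI_INT, leader_ranks[g], DONE_TAG, comm,
                  &done_req[g]);
    }

    /* Whenever some group reports completion, give it the next parameter. */
    while (next < n_params) {
        int g;
        MPI_Waitany(n_groups, done_req, &g, MPI_STATUS_IGNORE);
        MPI_Send(&params[next++], 1, MPI_DOUBLE, leader_ranks[g], PARAM_TAG, comm);
        MPI_Irecv(&dummy, 0, MPI_INT, leader_ranks[g], DONE_TAG, comm,
                  &done_req[g]);
    }

    /* ... wait for the remaining "done" messages and send a stop signal ... */
}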
