Probe seems to consume the CPU

Question

I've got an MPI program consisting of one master process that hands off commands to a bunch of slave processes. Upon receiving a command, a slave just calls system() to do it. While the slaves are waiting for a command, they are consuming 100% of their respective CPUs. It appears that Probe() is sitting in a tight loop, but that's only a guess. What do you think might be causing this, and what could I do to fix it?

Here's the code in the slave process that waits for a command. Watching the log and the top command at the same time suggests that when the slaves are consuming their CPUs, they are inside this function.

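#include <vector>

MpiMessage
Mpi::BlockingRecv() {
  LOG(8, "BlockingRecv");

  MpiMessage result;
  MPI::Status status;

  // Block until a message from any source, with any tag, is available.
  MPI::COMM_WORLD.Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, status);
  result.source = status.Get_source();
  result.tag = status.Get_tag();

  // Size the buffer from the probed message. The extra zero-initialized
  // byte keeps the payload null-terminated (the original used an
  // unterminated variable-length array, which is not standard C++).
  int num_elems = status.Get_count(MPI_CHAR);
  std::vector<char> buf(num_elems + 1, '\0');
  MPI::COMM_WORLD.Recv(
     buf.data(), num_elems, MPI_CHAR, result.source, result.tag
  );
  result.data = buf.data();
  LOG(7, "BlockingRecv about to return (%d, %d)", result.source, result.tag);
  return result;
}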

Recommended answer

Yes. For the sake of performance, most MPI implementations busy-wait on blocking operations. The assumption is that the MPI job is the only thing on the processor that we care about, so if a task is blocked waiting for communication, the best thing to do is to poll continually for that communication. That reduces latency: there is virtually no delay between when the message arrives and when it is handed off to the MPI task. It typically also means that the CPU is pegged at 100% even when nothing "real" is being done.

That's probably the best default behaviour for most MPI users, but it isn't always what you want. MPI implementations typically allow turning it off; with Open MPI, you can do so with an MCA parameter:

mpirun -np N --mca mpi_yield_when_idle 1 ./a.out
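
If changing the launch flags isn't an option, or the CPU still looks busy (as far as I know, that parameter makes an idle process yield the processor between polls rather than truly sleep), a common workaround is to replace the blocking Probe() with the non-blocking Iprobe() and sleep between polls. The sketch below is only an illustration: PollWithSleep is a hypothetical helper, not part of MPI, and the 10 ms interval is an arbitrary assumption that trades message latency for idle CPU.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper (not part of MPI): calls `poll` until it returns true,
// sleeping for `interval` between attempts so a waiting process stays off
// the CPU. Returns the number of times `poll` was called.
template <typename Poll>
int PollWithSleep(Poll poll, std::chrono::milliseconds interval) {
  int attempts = 1;
  while (!poll()) {
    std::this_thread::sleep_for(interval);
    ++attempts;
  }
  return attempts;
}

// Sketch of use inside BlockingRecv(), in place of the blocking Probe() call.
// It needs <mpi.h>, so it is shown as a comment to keep this block standalone:
//
//   MPI::Status status;
//   PollWithSleep(
//       [&] {
//         return MPI::COMM_WORLD.Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, status);
//       },
//       std::chrono::milliseconds(10));  // assumed interval; tune as needed
//   // `status` now describes the pending message; Recv() it as before.
```

A longer interval lowers idle CPU usage further but delays the moment a slave notices a new command, so it suits jobs like this one where each command runs for much longer than the sleep.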
