MPI, python, Scatterv, and overlapping data

Question

The MPI 3.0 standard says of MPI_Scatterv:

"The specification of counts, types, and displacements should not cause any location on the root to be read more than once."

However, my testing of mpi4py in Python with the code below does not indicate any problem with reading data from the root more than once:

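import numpy as np
from mpi4py import MPI  # provides the MPI name used below

comm = MPI.COMM_WORLD
nprocs = comm.Get_size()
rank = comm.Get_rank()

# Chunk sizes and start offsets into bigx, one entry per rank.
# Neighbouring chunks overlap by two elements.
counts = [16, 17, 16, 16, 16, 16, 15]
displs = [0, 14, 29, 43, 57, 71, 85]

# Only the root holds the full array.
if rank == 0:
    bigx = np.arange(100, dtype=np.float64)
else:
    bigx = None

# Receive buffer sized for this rank's chunk.
my_size = counts[rank]
x = np.zeros(my_size)

comm.Scatterv([bigx, counts, displs, MPI.DOUBLE], x, root=0)

print(x)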

The command

> mpirun -np 7 python mycode.py

produces

[ 57.  58.  59.  60.  61.  62.  63.  64.  65.  66.  67.  68.  69.  70.  71. 72.]

[ 85.  86.  87.  88.  89.  90.  91.  92.  93.  94.  95.  96.  97.  98.  99.]

[  0.   1.   2.   3.   4.   5.   6.   7.   8.   9.  10.  11.  12.  13.  14. 15.]

[ 29.  30.  31.  32.  33.  34.  35.  36.  37.  38.  39.  40.  41.  42.  43. 44.]

[ 43.  44.  45.  46.  47.  48.  49.  50.  51.  52.  53.  54.  55.  56.  57. 58.]

[ 71.  72.  73.  74.  75.  76.  77.  78.  79.  80.  81.  82.  83.  84.  85. 86.]

[ 14.  15.  16.  17.  18.  19.  20.  21.  22.  23.  24.  25.  26.  27.  28. 29.  30.]

The output is clearly correct, and the data on the root (process 0) has clearly been read more than once at each of the boundary points. Am I misunderstanding the MPI standard? Or is this fortuitous behavior that cannot be relied on in general?

FWIW, I'm running Python 2.7 on OS X.

Recommended answer

You cannot rely on this.

This follows directly from the MPI standard. Since the upper-case mpi4py functions are just a thin layer on top of MPI, the standard is what matters. It also states:

Rationale. Though not needed, the last restriction is imposed so as to achieve symmetry with MPI_GATHER, where the corresponding restriction (a multiple-write restriction) is necessary. (End of rationale.)
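To see why the multiple-write restriction is necessary on the gather side, here is a small plain-Python simulation. It is only a sketch: `simulate_gatherv` is a hypothetical helper, not an MPI or mpi4py call, and it assumes chunks are simply copied into the receive buffer in whatever order they arrive.

```python
def simulate_gatherv(chunks, displs, total, arrival_order):
    """Place each rank's chunk at its displacement, in the given arrival order."""
    recvbuf = [None] * total
    for rank in arrival_order:
        start = displs[rank]
        for i, value in enumerate(chunks[rank]):
            recvbuf[start + i] = value  # a later arrival overwrites an earlier one
    return recvbuf

# Two hypothetical ranks whose receive regions overlap at index 2.
chunks = [["a0", "a1", "a2"], ["b0", "b1", "b2"]]
displs = [0, 2]

rank0_first = simulate_gatherv(chunks, displs, 5, [0, 1])
rank1_first = simulate_gatherv(chunks, displs, 5, [1, 0])
print(rank0_first)
print(rank1_first)
```

The two results differ at the shared index, so an overlapping gather has no well-defined outcome. An overlapping scatter only reads, which is why the standard calls the restriction "not needed" there.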

Since the restriction is in the standard, an MPI implementation may respond to a violation in any of these ways:

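  • ignore the violation
  • issue a warning when the restriction is violated
  • fail when the restriction is violated
  • rely on the assumption for any kind of optimization, which may lead to undefined behavior when it is violated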

The last point is the scariest, as it may introduce subtle bugs. Given the read-only nature of the send buffer, it is difficult to imagine such an optimization, but that doesn't mean one does not or will not exist. As an illustration, consider strict-aliasing optimizations. Also note that MPI implementations are very complex: their behavior may change seemingly erratically between versions, configurations, data sizes, or other environmental changes.
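Because an implementation is free to ignore the violation silently, it can help to check the arguments yourself before calling Scatterv. The following is a minimal sketch. `find_overlaps` is a hypothetical helper, and it assumes counts and displacements are both expressed in elements of the same datatype, as in the question.

```python
def find_overlaps(counts, displs):
    """Return (rank_a, rank_b, first_shared_index, n_shared) for each overlapping pair."""
    overlaps = []
    nranks = len(counts)
    for a in range(nranks):
        for b in range(a + 1, nranks):
            # Intersection of the half-open ranges [displ, displ + count).
            start = max(displs[a], displs[b])
            stop = min(displs[a] + counts[a], displs[b] + counts[b])
            if stop > start:
                overlaps.append((a, b, start, stop - start))
    return overlaps

# The counts and displacements from the question.
counts = [16, 17, 16, 16, 16, 16, 15]
displs = [0, 14, 29, 43, 57, 71, 85]

for a, b, start, shared in find_overlaps(counts, displs):
    print("ranks %d and %d both read %d element(s) starting at index %d"
          % (a, b, shared, start))
```

For the question's arguments this reports one overlap between each pair of neighbouring ranks.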

There is also an infamous example with memcpy: the standard forbids overlapping memory inputs, and at some point the glibc implementation made use of that for a tiny, disputed optimization. Code that did not satisfy the requirement started to fail, and users started to hear strange sounds on mp3 flash websites, followed by a heated debate involving Linus Torvalds and Ulrich Drepper.

The moral of the story is: follow the requirements imposed by the standard, even if the code works right now and the requirement doesn't make sense to you. Also be glad that there is such a detailed standard.
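One way to follow the requirement while still handing each rank an overlapping chunk is to copy the chunks into a packed send buffer on the root, so that every location is read only once. This is a sketch under assumptions, not the answer's own method: `packed_displacements`, `pack_overlapping`, and `scatter_overlapping` are hypothetical names, and the data is assumed to be float64 as in the question.

```python
import numpy as np

def packed_displacements(counts):
    """Displacements for chunks laid out back to back (running sum of counts)."""
    displs = [0]
    for c in counts[:-1]:
        displs.append(displs[-1] + c)
    return displs

def pack_overlapping(bigx, counts, displs):
    """Copy each (possibly overlapping) chunk of bigx into its own slot."""
    return np.concatenate([bigx[d:d + c] for c, d in zip(counts, displs)])

def scatter_overlapping(comm, bigx, counts, displs, root=0):
    """Scatterv overlapping chunks of bigx without overlapping reads on the root."""
    from mpi4py import MPI  # imported here so the helpers above work without MPI
    rank = comm.Get_rank()
    # Only the root has data to pack; other ranks pass None, as in the question.
    packed = pack_overlapping(bigx, counts, displs) if rank == root else None
    x = np.zeros(counts[rank])
    comm.Scatterv([packed, counts, packed_displacements(counts), MPI.DOUBLE],
                  x, root=root)
    return x

# The packing step alone, using the question's arguments.
counts = [16, 17, 16, 16, 16, 16, 15]
displs = [0, 14, 29, 43, 57, 71, 85]
bigx = np.arange(100, dtype=np.float64)
packed = pack_overlapping(bigx, counts, displs)
print(len(packed), packed_displacements(counts))
```

Under mpirun, each rank would call `scatter_overlapping(MPI.COMM_WORLD, bigx, counts, displs)`, with `bigx` set to None everywhere except the root. The cost is one extra copy of the data on the root, duplicated boundary elements included.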
