Summing Python Objects with MPI's Allreduce

Problem description

I am working with a sparse tensor array implementation that I built from dictionaries and Counters in Python, and I would like to be able to use it in parallel. The bottom line is that I end up with a Counter on each node, and I would like to add them together using MPI.Allreduce (or another nice solution). For instance, with Counters one can do this:

A = Counter({'a': 1, 'b': 2, 'c': 3})
B = Counter({'b': 1, 'c': 2, 'd': 3})

such that

C = A + B  # Counter({'c': 5, 'b': 3, 'd': 3, 'a': 1})

I would like to do this same operation but with all the relevant nodes,

MPI.Allreduce(send_counter, recv_counter, MPI.SUM)

However, MPI doesn't seem to recognize this operation on dictionaries/Counters; it throws an error saying it expects a buffer or a list/tuple. Is my best option a user-defined operation, or is there a way to get Allreduce to add Counters? Thanks.

EDIT (7/14/15): I have attempted to create a user operation for dictionaries but there have been some discrepancies. I wrote the following

def dict_sum(dict1, dict2, datatype):
    for key in dict2:
        try:
            dict1[key] += dict2[key]
        except KeyError:
            dict1[key] = dict2[key]

and when I told MPI about the function I did this:

dictSumOp = MPI.Op.Create(dict_sum, commute=True)

and in the code I used it as

the_result = comm.allreduce(mydict, dictSumOp)

However, it threw an "unsupported operand '+' for type dict" error, so I wrote

the_result = comm.allreduce(mydict, op=dictSumOp)

and now it fails on the line dict1[key] += dict2[key] with TypeError: 'NoneType' object has no attribute '__getitem__', so apparently it wants to know that those things are dictionaries? How do I tell it that they have type dict?

Solution

Neither MPI nor mpi4py knows anything about Counters in particular, so you need to create your own reduction operation for this to work; the same would be true for any other sort of Python object:

#!/usr/bin/env python
from mpi4py import MPI
import collections

def addCounter(counter1, counter2, datatype):
    # Merge counter2 into counter1 and return the merged Counter
    for item in counter2:
        counter1[item] += counter2[item]
    return counter1

if __name__=="__main__":

    comm = MPI.COMM_WORLD

    # Rank 0 holds one Counter; every other rank holds a different one
    if comm.rank == 0:
        myCounter = collections.Counter({'a':1, 'b':2, 'c':3})
    else:
        myCounter = collections.Counter({'b':1, 'c':2, 'd':3})

    # Wrap the Python function as an MPI reduction operation
    counterSumOp = MPI.Op.Create(addCounter, commute=True)

    # Lowercase allreduce works on pickled Python objects
    totcounter = comm.allreduce(myCounter, op=counterSumOp)
    print(comm.rank, totcounter)

Here we've taken a function that sums two Counter objects and created an MPI operator out of it with MPI.Op.Create; mpi4py will unpickle the objects, run this function to combine them pairwise, then pickle the partial result and send it off to the next task.
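To make the pairwise combination concrete, here is a minimal sketch that imitates it in plain Python, without MPI. The helper simulate_reduce and the function add_counter_no_return are hypothetical names introduced only for this illustration; functools.reduce stands in for the step where each partial result is handed to the next combination. It assumes, as the description above suggests, that the value returned by the function is what gets passed on. Under that assumption, a function without a return statement (like dict_sum in the question) hands on None, and the next combination fails with a TypeError on a NoneType.

```python
from collections import Counter
from functools import reduce


def add_counter(counter1, counter2, datatype=None):
    """Merge counter2 into counter1 and return the merged result."""
    for item in counter2:
        counter1[item] += counter2[item]
    return counter1


def add_counter_no_return(counter1, counter2, datatype=None):
    """Same merge, but nothing is returned, so the caller receives None."""
    for item in counter2:
        counter1[item] += counter2[item]


def simulate_reduce(op, per_rank_values):
    """Hypothetical stand-in for the reduction: fold op over one value per rank."""
    # Copy the inputs so the in-place merge does not modify the caller's data.
    copies = [Counter(value) for value in per_rank_values]
    return reduce(lambda left, right: op(left, right, None), copies)


# One Counter for "rank 0" and an identical Counter for each of three other ranks.
ranks = [Counter({'a': 1, 'b': 2, 'c': 3})] + [Counter({'b': 1, 'c': 2, 'd': 3})] * 3

total = simulate_reduce(add_counter, ranks)
print(total)

try:
    simulate_reduce(add_counter_no_return, ranks)
    failed = False
except TypeError as exc:
    # The first combination returned None, so the second one cannot index it.
    failed = True
    print("TypeError:", exc)
```

The totals match the four-process run of the script: 'a' appears once, while 'b', 'c' and 'd' accumulate contributions from every rank.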

Note too that we're using (lowercase) allreduce, which works on arbitrary Python objects, rather than (uppercase) Allreduce, which works on NumPy arrays or their moral equivalents (buffers, which map onto the Fortran/C arrays that the MPI API is designed around).
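For comparison, the buffer-style route would need every rank to lay its counts out in the same fixed order before an element-wise sum could make sense. The sketch below only illustrates that packing step in plain Python; it does not call MPI. The shared key list and the helpers counter_to_dense and dense_to_counter are assumptions made for this example, and the element-wise sum is done locally as a stand-in for what a buffer reduction with MPI.SUM would compute.

```python
from collections import Counter


def counter_to_dense(counter, keys):
    """Lay the counts out in the fixed key order, using 0 for missing keys."""
    return [counter.get(key, 0) for key in keys]


def dense_to_counter(values, keys):
    """Rebuild a Counter from the dense layout, dropping zero entries."""
    return Counter({key: value for key, value in zip(keys, values) if value != 0})


# Hypothetical vocabulary that every rank would have to agree on in advance.
keys = ['a', 'b', 'c', 'd']

per_rank = [
    Counter({'a': 1, 'b': 2, 'c': 3}),
    Counter({'b': 1, 'c': 2, 'd': 3}),
]

dense = [counter_to_dense(counter, keys) for counter in per_rank]

# Local stand-in for an element-wise sum across ranks.
summed = [sum(column) for column in zip(*dense)]

total = dense_to_counter(summed, keys)
print(summed)
print(total)
```

For a genuinely sparse tensor the dense layout may be far too large to be practical, which is one reason the object-based allreduce with a custom operation is attractive here.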

Running gives:

$ mpirun -np 2 python ./counter_reduce.py 
0 Counter({'c': 5, 'b': 3, 'd': 3, 'a': 1})
1 Counter({'c': 5, 'b': 3, 'd': 3, 'a': 1})

$ mpirun -np 4 python ./counter_reduce.py 
0 Counter({'c': 9, 'd': 9, 'b': 5, 'a': 1})
2 Counter({'c': 9, 'd': 9, 'b': 5, 'a': 1})
1 Counter({'c': 9, 'd': 9, 'b': 5, 'a': 1})
3 Counter({'c': 9, 'd': 9, 'b': 5, 'a': 1})

With only modest changes, the same approach works with a generic dictionary:

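#!/usr/bin/env python
from mpi4py import MPI

def addCounter(counter1, counter2, datatype):
    # Merge counter2 into counter1; a plain dict has no default value,
    # so keys that are missing from counter1 are inserted explicitly
    for item in counter2:
        if item in counter1:
            counter1[item] += counter2[item]
        else:
            counter1[item] = counter2[item]
    return counter1

if __name__=="__main__":

    comm = MPI.COMM_WORLD

    # Values only need to support +=, so strings work as well as numbers
    if comm.rank == 0:
        myDict = {'a':1, 'c':"Hello "}
    else:
        myDict = {'c':"World!", 'd':3}

    # Wrap the Python function as an MPI reduction operation
    counterSumOp = MPI.Op.Create(addCounter, commute=True)

    totDict = comm.allreduce(myDict, op=counterSumOp)
    print(comm.rank, totDict)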

Running this gives:

$ mpirun -np 2 python dict_reduce.py 
0 {'a': 1, 'c': 'Hello World!', 'd': 3}
1 {'a': 1, 'c': 'Hello World!', 'd': 3}
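One caveat about the dictionary example: the numeric entries add up the same in any order, but string concatenation depends on which operand comes first. The sketch below shows this in plain Python, without MPI; add_dict is a local copy of the merge function written for this illustration. Since commute=True declares the operation commutative, it may be safer to pass commute=False to MPI.Op.Create when values are strings or other order-sensitive types. Whether a particular mpi4py version actually reorders operands is not something the answer states, so treat that as an assumption to check.

```python
def add_dict(dict1, dict2, datatype=None):
    """Merge dict2 into dict1 with +=, inserting keys that are missing."""
    for item in dict2:
        if item in dict1:
            dict1[item] += dict2[item]
        else:
            dict1[item] = dict2[item]
    return dict1


rank0 = {'a': 1, 'c': "Hello "}
rank1 = {'c': "World!", 'd': 3}

# Combine in both orders, working on copies so the inputs stay intact.
forward = add_dict(dict(rank0), dict(rank1))
backward = add_dict(dict(rank1), dict(rank0))

print(forward['c'])
print(backward['c'])

# The numeric entries do not depend on the order of combination.
print(forward['a'] == backward['a'] and forward['d'] == backward['d'])
```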
