Modify object in python multiprocessing


Problem description


I have a large array of custom objects on which I need to perform independent (parallelizable) tasks, including modifying object parameters. I've tried using both a Manager().dict and the sharedmem module, but neither works. For example:

import numpy as np
import multiprocessing as mp
import sharedmem as shm


class Tester:

    num = 0.0
    name = 'none'
    def __init__(self,tnum=num, tname=name):
        self.num  = tnum
        self.name = tname

    def __str__(self):
        return '%f %s' % (self.num, self.name)

def mod(test, nn):
    test.num = np.random.randn()
    test.name = nn


if __name__ == '__main__':

    num = 10

    tests = np.empty(num, dtype=object)
    for it in range(num):
        tests[it] = Tester(tnum=it*1.0)

    sh_tests = shm.empty(num, dtype=object)
    for it in range(num):
        sh_tests[it] = tests[it]
        print sh_tests[it]

    print '\n'
    workers = [ mp.Process(target=mod, args=(test, 'some') ) for test in sh_tests ]

    for work in workers: work.start()

    for work in workers: work.join()

    for test in sh_tests: print test

prints out:

0.000000 none
1.000000 none
2.000000 none
3.000000 none
4.000000 none
5.000000 none
6.000000 none
7.000000 none
8.000000 none
9.000000 none


0.000000 none
1.000000 none
2.000000 none
3.000000 none
4.000000 none
5.000000 none
6.000000 none
7.000000 none
8.000000 none
9.000000 none

I.e. the objects aren't modified.

How can I achieve the desired behavior?

Solution

The problem is that when the objects are passed to the worker processes, they are packed up with pickle, shipped to the other process, and unpacked and worked on there. Your objects aren't so much passed to the other processes as cloned. You don't return the objects, so each cloned object is happily modified and then thrown away.
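
A quick way to see the cloning in isolation (a minimal sketch, assuming the Tester class above; this snippet is an illustration, not part of the original answer):

import pickle

t = Tester(tnum=1.0)
clone = pickle.loads(pickle.dumps(t))  # the same round-trip multiprocessing performs
clone.num = 99.0
print(t.num)  # still 1.0: modifying the clone never touches the original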

It looks like this cannot be done directly (Python: Possible to share in-memory data between 2 separate processes).
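
As an aside, a Manager can host shared mutable state, but every mutation has to go back through the proxy: fetching an entry hands you a plain local copy, which is likely why the Manager().dict attempt appeared to do nothing. A minimal sketch of that pattern (an illustration under those assumptions, not code from the original answer):

import multiprocessing as mp

def mod(shared, idx, nn):
    entry = shared[idx]   # fetching returns a plain local copy
    entry['num'] = idx * 10.0
    entry['name'] = nn
    shared[idx] = entry   # reassign to push the change back through the proxy

if __name__ == '__main__':
    mgr = mp.Manager()
    shared = mgr.dict((i, {'num': i * 1.0, 'name': 'none'}) for i in range(3))
    workers = [mp.Process(target=mod, args=(shared, i, 'some')) for i in range(3)]
    for work in workers: work.start()
    for work in workers: work.join()
    for i in range(3):
        print(shared[i])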

What you can do is return the modified objects.

import numpy as np
import multiprocessing as mp



class Tester:

    num = 0.0
    name = 'none'
    def __init__(self,tnum=num, tname=name):
        self.num  = tnum
        self.name = tname

    def __str__(self):
        return '%f %s' % (self.num, self.name)

def mod(test, nn, out_queue):
    print test.num
    test.num = np.random.randn()
    print test.num
    test.name = nn
    out_queue.put(test)  # ship the modified clone back to the parent




if __name__ == '__main__':       
    num = 10
    out_queue = mp.Queue()
    tests = np.empty(num, dtype=object)
    for it in range(num):
        tests[it] = Tester(tnum=it*1.0)


    print '\n'
    workers = [ mp.Process(target=mod, args=(test, 'some', out_queue) ) for test in tests ]

    for work in workers: work.start()

    for work in workers: work.join()

    res_lst = []
    for j in range(len(workers)):
        res_lst.append(out_queue.get())  # collect one modified object per worker

    for test in res_lst: print test

This does lead to the interesting observation that, because the spawned processes are identical, they all start with the same random number generator seed, so they all generate the same 'random' number.
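
One fix for that (an addition, not part of the original answer): re-seed the generator inside each worker so the children stop inheriting the parent's RNG state. Called with no argument, np.random.seed() pulls fresh entropy from the OS, so each process gets its own stream:

import numpy as np

def mod(test, nn, out_queue):
    np.random.seed()  # no argument: re-seed from OS entropy in each process
    test.num = np.random.randn()
    test.name = nn
    out_queue.put(test)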
