Multiprocessing a loop of a function that writes to an array in Python


Problem description

I'm trying to implement multiprocessing for this loop. It fails to modify the array, and it does not seem to order the jobs correctly (it returns the array before the last function call is done).

import multiprocessing
import numpy


def func(i, array):
    array[i] = i**2
    print(i**2)

def main(n):
    array = numpy.zeros(n)

    if __name__ == '__main__':
        jobs = []
        for i in range(0, n):
            p = multiprocessing.Process(target=func, args=(i, array))
            jobs.append(p)
            p.start()

    return array

print(main(10))

Recommended answer

Processes do not share memory. Your program first creates an array full of zeros, then starts 10 processes, each of which calls func on its own copy of the array as it was when the process was created, never on the original array. main also returns without joining the processes, so the array can be returned before the last call has finished.
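The copy behaviour can be checked with a small sketch. The names square_into and run_children are hypothetical, a plain list stands in for the NumPy array (an array from numpy.zeros behaves the same way here), and preferring the "fork" start method where it exists is an assumption made only so the sketch also runs from an unguarded script:

```python
import multiprocessing


def square_into(i, array):
    # Runs in the child process: this writes to the child's private copy.
    array[i] = i ** 2


def run_children(n):
    # Assumption: use "fork" where the platform offers it, else the default.
    methods = multiprocessing.get_all_start_methods()
    ctx = multiprocessing.get_context("fork" if "fork" in methods else None)
    array = [0] * n
    jobs = [ctx.Process(target=square_into, args=(i, array)) for i in range(n)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()  # wait for every child before inspecting the list
    return array


if __name__ == "__main__":
    # Every child has finished, yet the parent's list is unchanged.
    print(run_children(5))
```

Even with every child joined, the parent still sees only zeros, which suggests the missing writes come from the lack of shared memory rather than from timing.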

It seems like what you're really trying to accomplish is this:

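from multiprocessing import Process, Lock
from multiprocessing.sharedctypes import Array


def modify_array(index, sharedarray):
    # Writes go to the shared memory block, so the parent sees them.
    sharedarray[index] = index ** 2
    print([x for x in sharedarray])


def main(n):
    lock = Lock()
    # 'i' is a C int; size the array from n rather than a hard-coded 10.
    array = Array('i', n, lock=lock)
    for i in range(0, n):
        p = Process(target=modify_array, args=(i, array))
        p.start()
        # Joining inside the loop runs the processes one at a time,
        # which is why the printed output appears in order.
        p.join()
    return list(array)


# The guard belongs at module level: start methods that re-import
# this file must not run main() again in each child.
if __name__ == '__main__':
    main(10)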

Output:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 1, 4, 0, 0, 0, 0, 0, 0, 0]
[0, 1, 4, 9, 0, 0, 0, 0, 0, 0]
[0, 1, 4, 9, 16, 0, 0, 0, 0, 0]
[0, 1, 4, 9, 16, 25, 0, 0, 0, 0]
[0, 1, 4, 9, 16, 25, 36, 0, 0, 0]
[0, 1, 4, 9, 16, 25, 36, 49, 0, 0]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 0]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
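If the workers should actually overlap, one possible variation starts every process first and joins them all afterwards. This is a sketch, not the answer's own code: run_concurrently is a hypothetical name, and preferring the "fork" start method where available is an assumption made so the snippet also runs from an unguarded script.

```python
import multiprocessing


def modify_array(index, sharedarray):
    # Each worker writes one slot of the shared array.
    sharedarray[index] = index ** 2


def run_concurrently(n):
    # Assumption: use "fork" where the platform offers it, else the default.
    methods = multiprocessing.get_all_start_methods()
    ctx = multiprocessing.get_context("fork" if "fork" in methods else None)
    # Shared array of n C ints, zero-initialised, guarded by a default lock.
    array = ctx.Array('i', n)
    jobs = [ctx.Process(target=modify_array, args=(i, array)) for i in range(n)]
    for p in jobs:
        p.start()  # launch everything before waiting on anything
    for p in jobs:
        p.join()   # only read the array once every worker is done
    return list(array)


if __name__ == "__main__":
    print(run_concurrently(10))
```

Joining all processes before reading the array is what keeps the function from returning a partially filled result.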

But the problem is that using multiprocessing here is misguided. Spawning an additional process carries a lot of overhead compared to starting a new thread, or even to staying single-threaded and using an event loop to trigger actions.
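For comparison, threads live in the same process and can write to the same object directly, so no shared-memory wrapper is needed. A minimal sketch, assuming a plain list as the target and a hypothetical run_in_threads helper:

```python
from concurrent.futures import ThreadPoolExecutor


def modify_array(index, array):
    # Threads share the parent's memory, so this write is visible to the caller.
    array[index] = index ** 2


def run_in_threads(n):
    array = [0] * n
    with ThreadPoolExecutor() as pool:
        # list(...) drains the results so any worker exception surfaces here.
        list(pool.map(modify_array, range(n), [array] * n))
    # Leaving the with-block waits for all submitted work to finish.
    return array


if __name__ == "__main__":
    print(run_in_threads(10))
```

Each thread writes a different index, so the writes do not conflict; workers that touched the same slot would need a lock.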

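An example of using concurrency within a single Python process, driven by an asyncio event loop, may look like the following. Note that run_in_executor with None as the executor hands each call to the default thread pool, so the function calls themselves run on worker threads: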

import asyncio

import numpy as np


def modify_array(index, array):
    array[index] = index ** 2
    print([x for x in array])


async def task(function, index, array):
    # Passing None selects the loop's default thread pool executor.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, function, index, array)


async def run_all(n):
    array = np.zeros(n)
    # gather() waits until every task has completed.
    await asyncio.gather(*(task(modify_array, i, array) for i in range(n)))
    return array


def main(n):
    # asyncio.run() creates the event loop, runs run_all(), and closes the loop.
    return asyncio.run(run_all(n))

main(10)

This is a popular pattern these days: using an asyncio event loop to run tasks concurrently. However, since you're using a library such as NumPy, I question how valuable this pattern may be to you.
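One way to read that closing remark: for an element-wise computation like this one, NumPy can fill the whole array in a single vectorized expression, with no workers at all. A minimal sketch, where squares is a hypothetical helper name:

```python
import numpy as np


def squares(n):
    # arange(n) yields 0..n-1; ** 2 squares every element in one step.
    return np.arange(n) ** 2


if __name__ == "__main__":
    print(squares(10))
```

Whether this applies to the real workload depends on what the per-element function actually does. The squaring in the question is only a stand-in.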
