How to use Python multiprocessing Pool.map to fill a numpy array in a for loop
Question
I want to fill a 2D numpy array within a for loop and speed up the calculation by using multiprocessing.
    import numpy
    from multiprocessing import Pool

    array_2D = numpy.zeros((20,10))
    pool = Pool(processes = 4)

    def fill_array(start_val):
        return range(start_val,start_val+10)

    list_start_vals = range(40,60)
    for line in xrange(20):
        array_2D[line,:] = pool.map(fill_array,list_start_vals)
    pool.close()

    print array_2D
When I run it, Python starts 4 subprocesses and occupies 4 CPU cores, but the execution doesn't finish and the array is not printed. If I try to write the array to disk, nothing happens.
Can anyone tell me why?
Recommended answer
The following works. First, it is a good idea to protect the main part of your code inside a main block (if __name__ == '__main__':) to avoid weird side effects. The result of pool.map() is a list containing the value returned by fill_array for each item in the iterable list_start_vals, so you don't have to create array_2D beforehand.
    import numpy as np
    from multiprocessing import Pool

    def fill_array(start_val):
        # One row: ten consecutive integers starting at start_val
        return list(range(start_val, start_val+10))

    if __name__=='__main__':
        pool = Pool(processes=4)
        list_start_vals = range(40, 60)
        # pool.map returns one row per start value; stack them into a 2D array
        array_2D = np.array(pool.map(fill_array, list_start_vals))
        pool.close() # ATTENTION HERE
        print(array_2D)
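Independently of the hang described in the question, the row-by-row assignment there could not work as written, because each call to pool.map() already yields all 20 rows. The sketch below illustrates this under one assumption: the built-in map stands in for pool.map (both return one result per input, in input order), so no worker processes are needed to see the shapes. The flag name assign_failed is only illustrative.

```python
import numpy as np


def fill_array(start_val):
    # One row: ten consecutive integers starting at start_val
    return list(range(start_val, start_val + 10))


list_start_vals = range(40, 60)

# Stand-in for pool.map(fill_array, list_start_vals): one result per input, in order
rows = list(map(fill_array, list_start_vals))

# The list of 20 rows converts directly into a 20 x 10 array
full = np.array(rows)
print(full.shape)

# Assigning all 20 rows to a single row of a preallocated array fails
array_2D = np.zeros((20, 10))
assign_failed = False
try:
    array_2D[0, :] = rows
except ValueError as exc:
    assign_failed = True
    print("cannot assign:", exc)
```

This is why the answer builds the array from a single map call instead of looping over the rows.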
You may run into trouble with pool.close(); following the comments from @hpaulj, you can simply remove this line if it causes problems.
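As a possible alternative to calling pool.close() by hand, here is a minimal sketch that assumes Python 3.3 or later, where Pool can be used as a context manager. The helper name build_array is hypothetical and not part of the original answer. Leaving the with block terminates the pool, which is safe here because pool.map() only returns once all results are available.

```python
import numpy as np
from multiprocessing import Pool


def fill_array(start_val):
    # One row: ten consecutive integers starting at start_val
    return list(range(start_val, start_val + 10))


def build_array(start_vals, processes=4):
    # Hypothetical helper: compute every row in worker processes,
    # then stack the rows into a 2D array.
    with Pool(processes=processes) as pool:
        # map blocks until all rows are computed and keeps the input order
        rows = pool.map(fill_array, start_vals)
    return np.array(rows)


if __name__ == '__main__':
    array_2D = build_array(range(40, 60))
    print(array_2D)
```

Keeping fill_array at module level, above the code that creates the pool, lets the worker processes find the function by name regardless of how they are started.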