How to parallelize loops in Python


Question

I am very new to multithreading and multiprocessing, and I am trying to parallelize a for loop. I searched for similar questions and wrote the following code based on the multiprocessing module.

import timeit, multiprocessing

start_time = timeit.default_timer()

d1 = dict( (i,tuple([i*0.1,i*0.2,i*0.3])) for i in range(500000) )
d2={}

def fun1(gn):
    for i in gn:
        x,y,z = d1[i]
        d2.update({i:((x+y+z)/3)})


if __name__ == '__main__':
    gen1 = [x for x in d1.keys()]
    fun1(gen1)
    #p= multiprocessing.Pool(3)
    #p.map(fun1,gen1)

    print('Script finished')
    stop_time = timeit.default_timer()
    print(stop_time - start_time)

Output:

Script finished
0.8113944193950299

If I change the code like this:

#fun1(gen1)
p= multiprocessing.Pool(5)
p.map(fun1,gen1)

I get this error:

for i in gn:
TypeError: 'int' object is not iterable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    raise self._value

Any ideas on how to make this parallel? MATLAB has a parfor construct for parallel loops, and I am trying to achieve the same thing with this approach, but it does not work. Also, what if the function returns a value? If fun1() returns 3 values, can I write something like a,b,c=p.map(fun1,gen1)?

(Running Python 3.6 on Windows.)

Recommended answer

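As @Alex Hall mentioned, remove the iteration from fun1: p.map already calls fun1 once for each element of gen1, so gn arrives as a single int, which is why the original loop raised the TypeError. Also, wait until all of the pool's workers have finished.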
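A minimal sketch of why the original call fails, assuming a hypothetical helper named describe (it is not part of the question's code): Pool.map hands the function one element of the iterable per call, not the whole list, so a function that tries to loop over its argument ends up iterating over an int.

```python
import multiprocessing


def describe(item):
    # Pool.map passes a single element of the iterable on each call,
    # so this reports the type of one element, not of the whole list.
    return type(item).__name__


if __name__ == "__main__":
    # The __main__ guard is required on Windows, where worker processes
    # re-import this module when they start.
    with multiprocessing.Pool(2) as pool:
        print(pool.map(describe, [0, 1, 2]))  # prints ['int', 'int', 'int']
```

Each call receives an int, which matches the 'int' object is not iterable message in the traceback.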

PEP8 note: import timeit, multiprocessing is bad practice; split it into two lines.

import multiprocessing
import timeit


start_time = timeit.default_timer()

d1 = dict( (i,tuple([i*0.1,i*0.2,i*0.3])) for i in range(500000) )
d2 = {}

def fun1(gn):
    x,y,z = d1[gn]
    d2.update({gn: ((x+y+z)/3)})


if __name__ == '__main__':
    gen1 = [x for x in d1.keys()]

    # serial processing
    for gn in gen1:
        fun1(gn)

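    # parallel processing
    # Note: each worker process updates its own copy of d2, so the parent's
    # d2 is not changed by this call; return values from fun1 to collect results.
    p = multiprocessing.Pool(3)
    p.map(fun1, gen1)
    p.close()  # no more tasks will be submitted
    p.join()   # wait for all workers to finish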

    print('Script finished')
    stop_time = timeit.default_timer()
    print(stop_time - start_time)
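The question also asks what to do when the function returns values. One possible approach, sketched here under assumptions rather than taken from the answer: have the worker function return its result and let p.map collect the results in input order. The names mean_of_triple and stats_of_triple are hypothetical, the dictionary is shrunk to 1000 entries, and chunksize=100 is an arbitrary choice. Because each worker process has its own copy of module-level dictionaries, a write to d2 inside a worker is not visible to the parent, which is the reason for returning results instead.

```python
import multiprocessing


def mean_of_triple(item):
    # item is a (key, (x, y, z)) pair; the returned pair is sent back to the parent.
    key, (x, y, z) = item
    return key, (x + y + z) / 3


def stats_of_triple(triple):
    # Hypothetical function that returns three values for each input.
    x, y, z = triple
    return min(triple), max(triple), (x + y + z) / 3


if __name__ == "__main__":
    d1 = {i: (i * 0.1, i * 0.2, i * 0.3) for i in range(1000)}

    with multiprocessing.Pool(3) as pool:
        # map blocks until every result is back; chunksize batches the items
        # to reduce the per-item communication overhead.
        d2 = dict(pool.map(mean_of_triple, d1.items(), chunksize=100))
        results = pool.map(stats_of_triple, d1.values(), chunksize=100)

    # map returns one 3-tuple per input, so a, b, c = p.map(...) would not
    # work directly; zip(*results) transposes the list into three sequences.
    lows, highs, means = zip(*results)
    print(len(d2), len(lows), len(highs), len(means))
```

For work this small per item, the cost of sending data between processes can outweigh the gain, so it is worth timing both the serial and the parallel version before committing to one.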
