Python's multiprocessing: speed up a for-loop for several sets of parameters, "apply" vs. "apply_async"

Question

I would like to integrate a system of differential equations using a lot of different parameter combinations and store the variables’ final values that belong to a certain set of parameters. Therefore, I implemented a simple for-loop in which random initial conditions and parameter combinations are created, the system is integrated and the values of interest are stored in the respective arrays. Since I intend to do this for many parameter combinations for a rather complex system (here I only use a toy system for illustration), which can also become stiff, I would like to parallelize the simulations to speed up the process using Python’s "multiprocessing" module.

However, when I run the simulations, the for-loop is always faster than its parallelized version. The only way I have found so far to beat the for-loop is to use "apply_async" instead of "apply". For 10 different parameter combinations, I get, for example, the following output (using the code below):

The for loop took  0.11986207962 seconds!
[ 41.75971761  48.06034375  38.74134139  25.6022232   46.48436046
  46.34952734  50.9073202   48.26035086  50.05026187  41.79483135]
Using apply took  0.180637836456 seconds!
41.7597176061
48.0603437545
38.7413413879
25.6022231983
46.4843604574
46.3495273394
50.9073202011
48.2603508573
50.0502618731
41.7948313502
Using apply_async took  0.000414133071899 seconds!
41.7597176061
48.0603437545
38.7413413879
25.6022231983
46.4843604574
46.3495273394
50.9073202011
48.2603508573
50.0502618731
41.7948313502

Although in this example the order of the results is identical for "apply" and "apply_async", this does not seem to be true in general. So, I would like to use "apply_async" since it is much faster, but in that case I don't know how to match the outcome of the simulations to the parameters/initial conditions used for the respective simulation.

So, my questions are:

1) Why is "apply" much slower than the simple for-loop in this case?

2) When I use "apply_async" instead of "apply", the parallelized version is much faster than the for-loop, but how can I then match the outcome of the simulations to the parameters I used in the respective simulation?

3) In this case, the results of "apply" and "apply_async" have the same order. Why is that? Coincidence?

My code can be found below:

from pylab import *
import multiprocessing as mp
from scipy.integrate import odeint
import time

#my system of differential equations
def myODE (yn,tvec,allpara):

    (x, y, z) = yn

    a, b = allpara['para']

    dx  = -x + a*y + x*x*y
    dy = b - a*y - x*x*y
    dz = x*y

    return (dx, dy, dz) 

#for reproducibility    
seed(0) 

#time settings for integration
dt = 0.01
tmax = 50
tval = arange(0,tmax,dt)

numVar = 3 #number of variables (x, y, z)
numPar = 2 #number of parameters (a, b)
numComb = 10 #number of parameter combinations

INIT = zeros((numComb,numVar)) #initial conditions will be stored here
PARA = zeros((numComb,numPar)) #parameter combinations for a and b will be stored here
RES = zeros(numComb) #z(tmax) will be stored here

tic = time.time()

for combi in range(numComb):

    INIT[combi,:] = append(10*rand(2),0) #initial conditions for x and y are randomly chosen, z is 0

    PARA[combi,:] = 10*rand(2) #parameter a and b are chosen randomly

    allpara = {'para': PARA[combi,:]}

    results = transpose(odeint(myODE, INIT[combi,:], tval, args=(allpara,))) #integrate system

    RES[combi] = results[numVar - 1][-1] #store z

    #INIT[combi,:] = results[:,-1] #update initial conditions
    #INIT[combi,-1] = 0 #set z to 0

toc = time.time()

print 'The for loop took ', toc-tic, 'seconds!'

print RES

#function for the multi-processing part
def runMyODE(yn,tvec,allpara):

    return transpose(odeint(myODE, yn, tvec, args=(allpara,)))

tic = time.time()

pool = mp.Pool(processes=4)
results = [pool.apply(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]})) for combi in range(numComb)]

toc = time.time()

print 'Using apply took ', toc-tic, 'seconds!'

for sol in range(numComb):
    print results[sol][2,-1] #print final value of z

tic = time.time()    
resultsAsync = [pool.apply_async(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]})) for combi in range(numComb)]    
toc = time.time()
print 'Using apply_async took ', toc-tic, 'seconds!'

for sol in range(numComb):
    print resultsAsync[sol].get()[2,-1] #print final value of z

Answer

Note that the fact that your apply_async run is 289 times faster than the for-loop is a little suspicious! And right now, you're guaranteed to get the results in the order they're submitted, even if that isn't what you want for maximum parallelism.

apply_async starts a task; it does not wait for it to complete (.get() does that). So this:

tic = time.time()    
resultsAsync = [pool.apply_async(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]})) for combi in range(numComb)]    
toc = time.time()

Isn't really a very fair measurement; you've started all the tasks, but they're not necessarily completed yet.

On the other hand, once you .get() the results, you know that the task has completed and that you have the answer; so doing this

for sol in range(numComb):
    print resultsAsync[sol].get()[2,-1] #print final value of z

means you are guaranteed to get the results in order (because you're going through the ApplyResult objects in order and calling .get() on each); but you might want each result as soon as it is ready rather than doing a blocking wait on them one at a time. That, however, means you'd need to label the results with their parameters one way or another.

You can use callbacks to save the results once the tasks are done, and return the parameters along with the results, to allow completely asynchronous returns:

def runMyODE(yn,tvec,allpara):
    return allpara['para'],transpose(odeint(myODE, yn, tvec, args=(allpara,)))

asyncResults = []

def saveResult(result):
    asyncResults.append((result[0], result[1][2,-1]))

tic = time.time()
for combi in range(numComb):
    pool.apply_async(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]}), callback=saveResult)
pool.close()
pool.join()
toc = time.time()

print 'Using apply_async took ', toc-tic, 'seconds!'

for res in asyncResults:
    print res[0], res[1]

Gives you a more reasonable time; the results are still almost always in order because the tasks take very similar amounts of time:

Using apply took  0.0847041606903 seconds!
[ 6.02763376  5.44883183] 41.7597176061
[ 4.37587211  8.91773001] 48.0603437545
[ 7.91725038  5.2889492 ] 38.7413413879
[ 0.71036058  0.871293  ] 25.6022231983
[ 7.78156751  8.70012148] 46.4843604574
[ 4.61479362  7.80529176] 46.3495273394
[ 1.43353287  9.44668917] 50.9073202011
[ 2.64555612  7.74233689] 48.2603508573
[ 0.187898    6.17635497] 50.0502618731
[ 9.43748079  6.81820299] 41.7948313502
Using apply_async took  0.0259671211243 seconds!
[ 4.37587211  8.91773001] 48.0603437545
[ 0.71036058  0.871293  ] 25.6022231983
[ 6.02763376  5.44883183] 41.7597176061
[ 7.91725038  5.2889492 ] 38.7413413879
[ 7.78156751  8.70012148] 46.4843604574
[ 4.61479362  7.80529176] 46.3495273394
[ 1.43353287  9.44668917] 50.9073202011
[ 2.64555612  7.74233689] 48.2603508573
[ 0.187898    6.17635497] 50.0502618731
[ 9.43748079  6.81820299] 41.7948313502

Note that rather than looping over apply, you could also use map:

def runCombi(combi): #the task function must be a module-level function, not a lambda
    return runMyODE(INIT[combi,:], tval, {'para': PARA[combi,:]})

results = pool.map(runCombi, range(numComb)) #blocks until done; results are in input order
