Python's multiprocessing: speed up a for-loop for several sets of parameters, "apply" vs. "apply_async"

Question

I would like to integrate a system of differential equations using a lot of different parameter combinations and store the variables’ final values that belong to a certain set of parameters. Therefore, I implemented a simple for-loop in which random initial conditions and parameter combinations are created, the system is integrated and the values of interest are stored in the respective arrays. Since I intend to do this for many parameter combinations for a rather complex system (here I only use a toy system for illustration), which can also become stiff, I would like to parallelize the simulations to speed up the process using Python’s "multiprocessing" module.

However, when I run the simulations, the for-loop is always faster than its parallelized version. The only way I have found so far to beat the for-loop is to use "apply_async" instead of "apply". For 10 different parameter combinations, I get, for example, the following output (using the code below):

The for loop took  0.11986207962 seconds!
[ 41.75971761  48.06034375  38.74134139  25.6022232   46.48436046
  46.34952734  50.9073202   48.26035086  50.05026187  41.79483135]
Using apply took  0.180637836456 seconds!
41.7597176061
48.0603437545
38.7413413879
25.6022231983
46.4843604574
46.3495273394
50.9073202011
48.2603508573
50.0502618731
41.7948313502
Using apply_async took  0.000414133071899 seconds!
41.7597176061
48.0603437545
38.7413413879
25.6022231983
46.4843604574
46.3495273394
50.9073202011
48.2603508573
50.0502618731
41.7948313502

Although in this example the order of the results is identical for "apply" and "apply_async", this does not seem to be true in general. So, I would like to use "apply_async" since it is much faster, but in that case I don't know how to match the outcome of the simulations to the parameters/initial conditions used for the respective simulation.

So, my questions are:

1) Why is "apply" much slower than the simple for-loop in this case?

2) When I use "apply_async" instead of "apply", the parallelized version is much faster than the for-loop, but how can I then match the outcome of the simulations to the parameters I used in the respective simulation?

3) In this case, the results of "apply" and "apply_async" have the same order. Why is that? Coincidence?

My code can be found below:

from pylab import *
import multiprocessing as mp
from scipy.integrate import odeint
import time

#my system of differential equations
def myODE (yn,tvec,allpara):

    (x, y, z) = yn

    a, b = allpara['para']

    dx  = -x + a*y + x*x*y
    dy = b - a*y - x*x*y
    dz = x*y

    return (dx, dy, dz) 

#for reproducibility    
seed(0) 

#time settings for integration
dt = 0.01
tmax = 50
tval = arange(0,tmax,dt)

numVar = 3 #number of variables (x, y, z)
numPar = 2 #number of parameters (a, b)
numComb = 10 #number of parameter combinations

INIT = zeros((numComb,numVar)) #initial conditions will be stored here
PARA = zeros((numComb,numPar)) #parameter combinations for a and b will be stored here
RES = zeros(numComb) #z(tmax) will be stored here

tic = time.time()

for combi in range(numComb):

    INIT[combi,:] = append(10*rand(2),0) #initial conditions for x and y are randomly chosen, z is 0

    PARA[combi,:] = 10*rand(2) #parameter a and b are chosen randomly

    allpara = {'para': PARA[combi,:]}

    results = transpose(odeint(myODE, INIT[combi,:], tval, args=(allpara,))) #integrate system

    RES[combi] = results[numVar - 1][-1] #store z

    #INIT[combi,:] = results[:,-1] #update initial conditions
    #INIT[combi,-1] = 0 #set z to 0

toc = time.time()

print 'The for loop took ', toc-tic, 'seconds!'

print RES

#function for the multi-processing part
def runMyODE(yn,tvec,allpara):

    return transpose(odeint(myODE, yn, tvec, args=(allpara,)))

tic = time.time()

pool = mp.Pool(processes=4)
results = [pool.apply(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]})) for combi in range(numComb)]

toc = time.time()

print 'Using apply took ', toc-tic, 'seconds!'

for sol in range(numComb):
    print results[sol][2,-1] #print final value of z

tic = time.time()    
resultsAsync = [pool.apply_async(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]})) for combi in range(numComb)]    
toc = time.time()
print 'Using apply_async took ', toc-tic, 'seconds!'

for sol in range(numComb):
    print resultsAsync[sol].get()[2,-1] #print final value of z

Answer

Note that the fact that your apply_async run is 289 times faster than the for-loop is a little suspicious! And right now, you're guaranteed to get the results in the order they're submitted, even if that isn't what you want for maximum parallelism.

apply_async starts a task; it does not wait for it to complete (.get() does that). So this:

tic = time.time()    
resultsAsync = [pool.apply_async(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]})) for combi in range(numComb)]    
toc = time.time()

Isn't really a very fair measurement; you've started all the tasks, but they're not necessarily completed yet.

On the other hand, once you .get() the results, you know that the task has completed and that you have the answer; so doing this

for sol in range(numComb):
    print resultsAsync[sol].get()[2,-1] #print final value of z

means you are guaranteed to get the results in order (because you're going through the ApplyResult objects in order and calling .get() on each); but you might want each result as soon as it is ready rather than doing a blocking wait on them one at a time. That, however, means you'd need to label the results with their parameters one way or another.

You can use callbacks to save the results once the tasks are done, and return the parameters along with the results, to allow completely asynchronous returns:

def runMyODE(yn,tvec,allpara):
    return allpara['para'],transpose(odeint(myODE, yn, tvec, args=(allpara,)))

asyncResults = []

def saveResult(result):
    asyncResults.append((result[0], result[1][2,-1]))

tic = time.time()
for combi in range(numComb):
    pool.apply_async(runMyODE, args=(INIT[combi,:],tval,{'para': PARA[combi,:]}), callback=saveResult)
pool.close()
pool.join()
toc = time.time()

print 'Using apply_async took ', toc-tic, 'seconds!'

for res in asyncResults:
    print res[0], res[1]

Gives you a more reasonable time; the results are still almost always in order because the tasks take very similar amounts of time:

Using apply took  0.0847041606903 seconds!
[ 6.02763376  5.44883183] 41.7597176061
[ 4.37587211  8.91773001] 48.0603437545
[ 7.91725038  5.2889492 ] 38.7413413879
[ 0.71036058  0.871293  ] 25.6022231983
[ 7.78156751  8.70012148] 46.4843604574
[ 4.61479362  7.80529176] 46.3495273394
[ 1.43353287  9.44668917] 50.9073202011
[ 2.64555612  7.74233689] 48.2603508573
[ 0.187898    6.17635497] 50.0502618731
[ 9.43748079  6.81820299] 41.7948313502
Using apply_async took  0.0259671211243 seconds!
[ 4.37587211  8.91773001] 48.0603437545
[ 0.71036058  0.871293  ] 25.6022231983
[ 6.02763376  5.44883183] 41.7597176061
[ 7.91725038  5.2889492 ] 38.7413413879
[ 7.78156751  8.70012148] 46.4843604574
[ 4.61479362  7.80529176] 46.3495273394
[ 1.43353287  9.44668917] 50.9073202011
[ 2.64555612  7.74233689] 48.2603508573
[ 0.187898    6.17635497] 50.0502618731
[ 9.43748079  6.81820299] 41.7948313502

Note that rather than looping over apply, you could also use map:

def runCombi(combi): #the task function must be a module-level function, not a lambda
    return runMyODE(INIT[combi,:], tval, {'para': PARA[combi,:]})

results = pool.map(runCombi, range(numComb)) #blocks until done; results are in input order
