Python:加快我的Runge-Kutta集成代码挑战 [英] Python : Speeding up my Runge-Kutta integration code challenge

查看:138
本文介绍了Python:加快我的Runge-Kutta集成代码挑战的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用随附的代码集成 Fitzhugh-Nagumo 型号:

I am using the attached code to integrate a version of Fitzhugh-Nagumo model :

from scipy.integrate import odeint
import numpy as np
import time

P = {'epsilon':0.1,
     'a1':1.0,
     'a2':1.0,
     'b':2.0,
     'c':0.2}

def fhn_rhs(V,t,P):
    u,v = V[0],V[1]
    u_t = u - u**3 - v
    v_t = P['epsilon']*(u - P['b']*v - P['c'])
    return np.stack((u_t,v_t))

def integrate(func,V0,t,args,step='RK4'):
    start = time.clock()
    P = args[0]
    result=[V0]
    for i,tmp in enumerate(t[1:]):
        result.append(RK4step(func,result[i],tmp,P,(t[i+1]-t[i])))
    print "Integration took ",time.clock() - start, " s"
    return np.array(result)


def RK4step(rhs,V,t,P,dt):
    k_1 = dt*rhs(V,t,P)
    k_2 = dt*rhs((V+(1.0/2.0)*k_1),t,P)
    k_3 = dt*rhs((V+(1.0/2.0)*k_2),t,P)
    k_4 = dt*rhs((V+k_3),t,P)
    return V+(1.0/6.0)*k_1+(1.0/3.0)*k_2+(1.0/3.0)*k_3+(1.0/6.0)*k_4

在我的integratescipy.integrate.odeint之间进行比较可以得出以下结论:

Comparing between my integrate and scipy.integrate.odeint gives the following:

In [8]: import cProfile

In [9]: %timeit integrate(fhn_rhs,np.stack((0.1,0.2)),np.linspace(0,100,1000),args=(P,))
10 loops, best of 3: 36.4 ms per loop

In [10]: %timeit odeint(fhn_rhs,np.stack((0.1,0.2)),np.linspace(0,100,1000),args=(P,))
100 loops, best of 3: 3.45 ms per loop

In [11]: cProfile.run('integrate(fhn_rhs,np.stack((0.1,0.2)),np.linspace(0,100,1000),args=(P,))')
         45972 function calls in 0.098 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.098    0.098 <string>:1(<module>)
     3996    0.011    0.000    0.078    0.000 fhn.py:20(fhn_rhs)
        1    0.002    0.002    0.097    0.097 fhn.py:42(integrate)
      999    0.016    0.000    0.094    0.000 fhn.py:52(RK4step)
        1    0.000    0.000    0.000    0.000 function_base.py:9(linspace)
     7994    0.011    0.000    0.021    0.000 numeric.py:484(asanyarray)
     3997    0.029    0.000    0.067    0.000 shape_base.py:282(stack)
    11991    0.008    0.000    0.008    0.000 shape_base.py:337(<genexpr>)
     3997    0.002    0.000    0.002    0.000 {len}
      999    0.001    0.000    0.001    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'astype' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.arange}
     7995    0.010    0.000    0.010    0.000 {numpy.core.multiarray.array}
     3997    0.006    0.000    0.006    0.000 {numpy.core.multiarray.concatenate}
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.result_type}

关于如何使我的代码运行更快的任何建议?我可以以某种方式使用numba进行加速吗?

Any suggestions on how I can make my code run faster? Can I use numba somehow to accelerate it?

推荐答案

您没有比较相同的东西.要查看odeint在什么点上实际评估ODE函数,请在其中放入print t语句(当然不计时). odeint以及通常具有自适应时间步长的方法会生成一个稀疏的积分样本列表,并从中插入所需的输出.

You are not comparing the same things. To see at what points odeint actually evaluates the ODE function, put a print t statement in (of course while not timing it). odeint and generally methods with adaptive time steps produce a sparse list of integration samples and interpolate the desired output from them.

对于RK4方法,您将不得不使用一个误差估计器,并在此基础上复制该自适应方案.

You would have to use an error estimator for the RK4 method and based on that replicate this adaptive scheme.

当然,使用矢量对象解释的python代码在执行过程中将与使用简单数组从odeint调用的lsoda的FORTRAN编译代码永远竞争.

And of course, interpreted python code using vector objects will never be competitive with the compiled FORTRAN code of lsoda called from odeint using simple arrays during its execution.

在带有插值的自适应步长方案中使用RK4的示例:

An example for using RK4 in an adaptive step size scheme with interpolation:

def RK4Step(f, x, y, h, k1):
    k2=f(x+0.5*h, y+0.5*h*k1)
    k3=f(x+0.5*h, y+0.5*h*k2)
    k4=f(x+    h, y+    h*k3)
    return (k1+2*(k2+k3)+k4)/6.0

def RK4TwoStep(f, x, y, h, k1):
    step1 = RK4Step(f, x , y , 0.5*h, k1        )
    x1, y1 = x+0.5*h, y+0.5*h*step1;
    step2 = RK4Step(f, x1, y1, 0.5*h, f(x1, y1) )
    return (step1+step2)/2

def RK4odeint(fin,y0,times, tol=1e-6, args=None):
    # numpy-ify the inputs
    if args:
        f = lambda t,y : np.array(fin(y,t,*args));
    else:
        f = lambda t,y : np.array(fin(y,t))
    y0 = np.array(y0)
    # allocate output structure
    yout = np.array([y0]*len(times));
    # in consequence, yout[0] = y0;
    # initialize integrator variables
    h = times[1]-times[0];
    hmax = abs(times[-1]-times[0]);

    # last and current point of the numerical integration
    ycurr = ylast = qcurr = qlast = y0; 
    tcurr = tlast = times[0];
    fcurr = flast = f(tcurr, ycurr);
    totalerr = 0.0
    totalvar = 0.0
    for i, t in enumerate(times[1:]):
        # remember that t == t[i+1], result goes to yout[i+1]
        while (t-tcurr)*h>0:
            # advance the integration                
            k1, k2 = RK4Step(f,tcurr,ycurr,h, fcurr), RK4TwoStep(f,tcurr,ycurr,h, fcurr);
            # RK4 is of fourth order, that is,
            # k1 = (y(x+h)-y(x))/h + C*h^4
            # k2 = (y(x+h)-y(x))/h + C*h^4/16
            # Using the double step k2 gives  
            # C*h^4/16 = (k2-k1)/15 as local error density
            # change h to match the global relative error density tol
            # use |k2| as scale for the absolute error
            # |k1-k2|/15*hfac^4 = tol*|k2|, h <- h*hfac

            scale = max(abs(k2))
            steperr = max(abs(k1-k2))/2
            # compute the ideal step size factor and sanitize the result to prevent ridiculous changes
            hfac = (  tol*scale / ( 1e-16+steperr)  )**0.25
            hfac = min(10, max(0.01, hfac) )

            # repeat the step if there is a significant step size correction
            if ( abs(h*hfac)<hmax and (0.6 > hfac or hfac > 3 )):
                # recompute with new step size
                h *= hfac;
                k2 = RK4TwoStep(f, tcurr, ycurr, h, fcurr) ;
            # update and cycle the integration points
            ylast = ycurr; ycurr = ycurr + h*k2;
            tlast = tcurr; tcurr += h;
            flast = fcurr; fcurr = f(tcurr, ycurr);
            # cubic Bezier control points
            qlast = ylast + (tcurr-tlast)/3*flast;
            qcurr = ycurr - (tcurr-tlast)/3*fcurr;

            totalvar += h*scale;
            totalerr = (1+h*scale)*totalerr + h*steperr;
            reportstr = "internal step to t=%12.8f \t" % tcurr;

        #now tlast <= t <= tcurr, can interpolate the value for yout[i+1] using the cubic Bezier formula
        s = (t - tlast)/(tcurr - tlast);
        yout[i+1] = (1-s)**2*((1-s)*ylast + 3*s*qlast) + s**2*(3*(1-s)*qcurr + s*ycurr)

    return np.array(yout)

可以类似于scipy.integrate.odeint来调用,具有与衍生函数相同的参数顺序约定(接口是根据该要求构造的,没有内在的必要性,可以根据需要进行更改)

This can be called similar to scipy.integrate.odeint with the same argument order conventions for the derivatives function (the interface was constructed with that requirement, there is no inherent necessity, change as you like)

P = {'epsilon':0.1,
     'a1':1.0,
     'a2':1.0,
     'b':2.0,
     'c':0.2}

def fhn_rhs(V,t,P):
    u,v = V
    u_t = u - u**3 - v
    v_t = P['epsilon']*(u - P['b']*v - P['c'])
    return np.array([u_t,v_t])


V0 = np.array([0.1,0.2])
time = np.linspace(0,100,1000)

V = RK4odeint(fhn_rhs,V0,time,args=(P,))

u,v = V.T
plt.plot(time, v) # or plt.plot(u,v) or ...

这篇关于Python:加快我的Runge-Kutta集成代码挑战的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆