More efficient way to loop?


Problem description


I have a small piece of code from a much larger script. I found that calls to the function t_area account for most of the run time. I tested the function by itself and it is not slow; I believe it takes so long because of the number of times it has to be run. Here is the code where the function is called:

tri_area = np.zeros((numx,numy),dtype=float)
for jj in range(0,numy-1):
    for ii in range(0,numx-1):
      xp = x[ii,jj]
      yp = y[ii,jj]
      zp = surface[ii,jj]
      ap = np.array((xp,yp,zp))

      xp = xp+dx
      zp = surface[ii+1,jj]
      bp = np.array((xp,yp,zp))

      yp = yp+dx
      zp = surface[ii+1,jj+1]
      dp = np.array((xp,yp,zp))

      xp = xp-dx
      zp = surface[ii,jj+1]
      cp = np.array((xp,yp,zp))

      tri_area[ii,jj] = t_area(ap,bp,cp,dp)

The arrays in use here are 216 x 217, and x and y have the same shape. I am pretty new to Python coding; I have used MATLAB in the past. So my question is: is there a way to get around the two for-loops, or a more efficient way to run this block of code in general? Looking for any help speeding this up! Thanks!

EDIT:

Thanks for the help everyone, this has cleared up a lot of confusion. I was asked about the function t_area that is used in the loop; here is the code:

def t_area(a,b,c,d):
    ab=b-a
    ac=c-a
    tri_area_a = 0.5*linalg.norm(np.cross(ab,ac))

    db=b-d
    dc=c-d
    tri_area_d = 0.5*linalg.norm(np.cross(db,dc))

    ba=a-b
    bd=d-b
    tri_area_b = 0.5*linalg.norm(np.cross(ba,bd))

    ca=a-c
    cd=d-c
    tri_area_c = 0.5*linalg.norm(np.cross(ca,cd))

    av_area = (tri_area_a + tri_area_b + tri_area_c + tri_area_d)*0.5
    return(av_area)

Sorry for the confusing notation. It made sense at the time; looking back now, I will probably change it. Thanks!

Solution

A caveat before we start. range(0, numy-1), which is equal to range(numy-1), produces the numbers from 0 to numy-2, not including numy-1. That gives you numy-1 values in total. While MATLAB has 1-based indexing, Python has 0-based, so be a bit careful with your indexing in the transition. Considering you have tri_area = np.zeros((numx, numy), dtype=float), tri_area[ii,jj] never accesses the last row or column with the way you have set up your loops. Therefore, I suspect the correct intention was to write range(numy).
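To make the off-by-one concrete, here is a small sketch. The value numy = 5 is arbitrary and chosen only for illustration; it does not come from the question.

```python
numy = 5  # hypothetical size, for illustration only

short_loop = list(range(0, numy - 1))  # what the question's loop iterates over
full_loop = list(range(numy))          # every valid index of an axis of length numy

print(short_loop)  # [0, 1, 2, 3]
print(full_loop)   # [0, 1, 2, 3, 4]
```

Keep in mind that the loop body reads surface[ii+1, jj+1], so looping all the way to numy only works if surface has one more row and column than x and y.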

Since the function t_area() is vectorisable, you can do away with the loops completely. Vectorisation means numpy applies an operation to a whole array at once by taking care of the loops under the hood, where they run faster.

First, we stack all the aps for each (i, j) element in an (m, n, 3) array, where (m, n) is the shape of x. If we take the cross product of two (m, n, 3) arrays, the operation is applied along the last axis by default. This means that, for every element (i, j), np.cross(a, b) takes the cross product of the 3 numbers in a[i,j] and b[i,j]. Similarly, for every element (i, j), np.linalg.norm(a, axis=2) calculates the norm of the 3 numbers in a[i,j], which reduces the array to shape (m, n). A bit of caution here though: we need to state axis=2 (the last of the three axes, counting from 0) explicitly, because without an axis argument norm returns a single number for the whole array.
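The shape behaviour described above can be checked with a minimal sketch. The grid size and the random inputs are made up for illustration.

```python
import numpy as np

m, n = 4, 5  # hypothetical grid size
rng = np.random.default_rng(0)
a = rng.random((m, n, 3))
b = rng.random((m, n, 3))

cross = np.cross(a, b)                 # cross product along the last axis -> (m, n, 3)
norms = np.linalg.norm(cross, axis=2)  # norm over the 3 components -> (m, n)

# The same quantity for one element, computed on plain 3-vectors
single = np.linalg.norm(np.cross(a[1, 2], b[1, 2]))

print(cross.shape, norms.shape)
print(np.isclose(norms[1, 2], single))
```

Each entry of norms equals what the scalar code would produce for that (i, j), which is why the two loops can be dropped.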

Note that in the following example my indexing relationship may not correspond to yours. The bare minimum to make this work is for surface to have one more row and one more column than x and y.

import numpy as np

def _t_area(a, b, c):
    ab = b - a
    ac = c - a
    return 0.5 * np.linalg.norm(np.cross(ab, ac), axis=2)

def t_area(x, y, surface, dx):
    a = np.zeros(x.shape + (3,), dtype=float)
    b = np.zeros_like(a)
    c = np.zeros_like(a)
    d = np.zeros_like(a)

    a[...,0] = x
    a[...,1] = y
    a[...,2] = surface[:-1,:-1]

    b[...,0] = x + dx
    b[...,1] = y
    b[...,2] = surface[1:,:-1]

    c[...,0] = x
    c[...,1] = y + dx
    c[...,2] = surface[:-1,1:]

    d[...,0] = b[...,0]
    d[...,1] = c[...,1]
    d[...,2] = surface[1:,1:]

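    # 0.5 averages the two diagonal splits of each cell (each pair of triangles covers it once);
    # are you sure you didn't mean 0.25, i.e. the mean area of a single triangle?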
    return 0.5 * (_t_area(a, b, c) + _t_area(d, b, c) + _t_area(b, a, d) + _t_area(c, a, d))

nx, ny = 250, 250

dx = np.random.random()
x = np.random.random((nx, ny))
y = np.random.random((nx, ny))
surface = np.random.random((nx+1, ny+1))

tri_area = t_area(x, y, surface, dx)

x in this example supports the indices 0-249, while surface supports 0-250. surface[:-1], a shorthand for surface[0:-1], returns all rows starting from 0 up to, but not including, the last one. -1 serves the same function as end in MATLAB. So, surface[:-1] returns the rows for indices 0-249. Similarly, surface[1:] returns the rows for indices 1-250, which achieves the same as your surface[ii+1].
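As a small illustration of how the four slices line up with the loop indices, the sketch below uses a tiny made-up surface; the corner_* names are hypothetical and only serve the example.

```python
import numpy as np

nx, ny = 3, 4  # small hypothetical sizes so the shapes are easy to read
surface = np.arange((nx + 1) * (ny + 1), dtype=float).reshape(nx + 1, ny + 1)

corner_a = surface[:-1, :-1]  # surface[ii,   jj  ] for every cell
corner_b = surface[1:, :-1]   # surface[ii+1, jj  ]
corner_c = surface[:-1, 1:]   # surface[ii,   jj+1]
corner_d = surface[1:, 1:]    # surface[ii+1, jj+1]

print(corner_a.shape)  # (3, 4)

ii, jj = 1, 2
print(corner_d[ii, jj] == surface[ii + 1, jj + 1])  # True
```

All four slices have shape (nx, ny), so they can be assigned directly into the last axis of the (nx, ny, 3) corner arrays.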


Note: I had written this section before it was known that t_area() could be fully vectorised. So while what is here is obsolete for the purposes of this answer, I'll leave it as legacy to show what optimisations could have been made had the function not been vectorisable.

Instead of calling the function for each element, which is expensive, you should pass it x, y, surface and dx and iterate internally. That means only one function call and less overhead.

Furthermore, you shouldn't create a new array for ap, bp, cp and dp on every iteration, which again adds overhead. Allocate them once outside the loop and just update their values.

One final change should be the order of loops. Numpy arrays are row major by default (while MATLAB is column major), so ii performs better as the outer loop. You wouldn't notice the difference for arrays of your size, but hey, why not?
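The memory layout can be inspected directly. A minimal sketch, using an array with the same shape as those in the question:

```python
import numpy as np

arr = np.zeros((216, 217), dtype=float)  # same shape as the arrays in the question

print(arr.flags['C_CONTIGUOUS'])  # True: row-major (C order) is the default
print(arr.strides)                # (1736, 8)
```

The strides say that moving one step along the last axis advances 8 bytes, while moving one step along the first axis advances a whole row of 217 * 8 = 1736 bytes. Iterating jj in the inner loop therefore walks through memory contiguously.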

Overall, the modified function should look like this.

def t_area(x, y, surface, dx):
    # I assume numx == x.shape[0]. If not, pass it as an extra argument.
    tri_area = np.zeros(x.shape, dtype=float)

    ap = np.zeros((3,), dtype=float)
    bp = np.zeros_like(ap)
    cp = np.zeros_like(ap)
    dp = np.zeros_like(ap)

    for ii in range(x.shape[0]-1): # do you really want range(numx-1) or just range(numx)?
        for jj in range(x.shape[1]-1):
            xp = x[ii,jj]
            yp = y[ii,jj]
            zp = surface[ii,jj]
            ap[:] = (xp, yp, zp)

            # get `bp`, `cp` and `dp` in a similar manner and compute `tri_area[ii,jj]`
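One way the remaining steps might be filled in is sketched below, following the corner pattern from the question's loop. The names t_area_loop and _tri are hypothetical, chosen so they do not clash with the functions above, and the 0.5 factor is taken as-is from the question.

```python
import numpy as np

def _tri(p, q, r):
    # Area of the triangle p-q-r from the cross product of two of its edges
    return 0.5 * np.linalg.norm(np.cross(q - p, r - p))

def t_area_loop(x, y, surface, dx):
    # Hypothetical completion of the loop version; the last row and column
    # of the result stay zero, as in the question's code.
    tri_area = np.zeros(x.shape, dtype=float)
    ap = np.zeros(3)
    bp = np.zeros(3)
    cp = np.zeros(3)
    dp = np.zeros(3)

    for ii in range(x.shape[0] - 1):
        for jj in range(x.shape[1] - 1):
            xp, yp = x[ii, jj], y[ii, jj]
            # Update the preallocated corner vectors in place
            ap[:] = (xp, yp, surface[ii, jj])
            bp[:] = (xp + dx, yp, surface[ii + 1, jj])
            cp[:] = (xp, yp + dx, surface[ii, jj + 1])
            dp[:] = (xp + dx, yp + dx, surface[ii + 1, jj + 1])
            tri_area[ii, jj] = 0.5 * (_tri(ap, bp, cp) + _tri(dp, bp, cp)
                                      + _tri(bp, ap, dp) + _tri(cp, ap, dp))
    return tri_area

# Flat unit-spaced grid: every cell is a 1 x 1 square, so each area should be 1
x, y = np.meshgrid(np.arange(4.0), np.arange(4.0), indexing="ij")
flat = t_area_loop(x, y, np.zeros((4, 4)), dx=1.0)
print(flat[:3, :3])
```

The flat-grid check is only a sanity test: on a planar surface both diagonal splits of a cell give the same area, so the averaged result should equal the cell area.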
