向python中的重复索引向量化循环 [英] Vectorizing for loop with repeated indices in python

查看:361
本文介绍了向python中的重复索引向量化循环的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图优化一个被调用了很多(数百万次)的代码片段,所以任何类型的速度改进(希望移除for循环)都会很好。

我正在计算一些第j个粒子与其他粒子的相关函数。(b)(b)(b) r') - s_k(r))^ 2))在k上平均。

我的想法是有一个变量corrfun把数据分成一些bin(r,在别处定义)。我找到了每个s_k所属的bin,它存储在ind中。因此,ind [0]是j(0)点对应的r(因此是corrfun)的索引。多个点可以落入同一个bin(实际上我想bin要足够大来包含多个点),所以我将所有的(s_j(r') - s_k(r))^ 2相加,然后除以指向那个bin(保存在变量rw中)。我最终为此做的代码如下(np是numpy):

  for k,v in enumerate(ind ):
if j == k:
continue
corrfun [v] + =(s [k] -s [j])** 2
rw [v] + = 1
rw2 = rw
rw2 [rw < 1] = 1
corrfun = np.sqrt(np.divide(corrfun,rw2))



<请注意,rw2业务是因为我想避免0除以0的问题,但我确实返回rw数组,我想能够区分rw = 0和rw = 1元素。也许还有一个更优雅的解决方案。

有没有办法让for循环更快?虽然我不想添加自我交互(j == k),但是如果我的自我交互意味着能够快得多的计算,我甚至可以确定(ind〜1E6的长度,所以自我交互可能是微不足道的) p>

谢谢!

ilya

编辑:



以下是完整的代码。请注意,在完整的代码中,我也平均超过j。

  import numpy as np 

def twopointcorr(x,y,s,dr):

width = np.max(x)-np.min(x)
height = np.max(y)-np。 min(y)

n = len(x)

maxR = np.sqrt((width / 2)** 2 +(height / 2)** 2)
$ b $ = np.arange(0,maxR,dr)
print(r)
corrfun = r * 0
rw = r * 0
print (0,n-1):
hypot = np.sqrt((x [j] -x)
'''遍历所有点'''
' ** 2 +(y [j] -y)** 2)
ind = [np.abs(rh).argmin()for h in hypot]

for k,v枚举(ind):
if j == k:
continue
corrfun [v] + =(s [k] -s [j])** 2
rw [v] + = 1

rw2 = rw
rw2 [rw< 1] = 1
corrfun = np.sqrt(np.divide(corrfun,rw2))
return r,corrfun,rw

我以下面的方式调试它:

  from twopointcorr import twopointcorr 
import numpy as np
import matplotlib.pyplot as plt
import time
$ bn = 1000
x = np.random.rand(n)
y = np .random.rand(n)
s = np.random.rand(n)

print('running two point corr functinon')

start_time = time。 time()
r,corrfun,rw = twopointcorr(x,y,s,0.1)
print(---执行时间是%s秒---%(time.time() - start_time))

fig1 = plt.figure()
plt.plot(r,corrfun,' - x')

fig2 = plt.figure()
plt.plot(r,rw,' - x')
plt.show()

再一次,主要问题是在真实数据集n〜1E6中。当然,我可以重新采样,使其更小,但我很想实际上通过数据集。

解决方案

这是代码,使用广播,下降,一轮,bincount删除所有的循环:

pre $ def $ p $ c $ def twopointcorr2(x,y,s,dr ):
width = np.max(x)-np.min(x)
height = np.max(y)-np.min(y)
n = len(x)
maxR = np.sqrt((width / 2)** 2 +(height / 2)** 2)
r = np.arange(0,maxR,dr)
osub = lambda x :np.subtract.outer(x,x)

ind = np.clip(np.round(np.hypot(osub(x),osub(y))/ dr),0,len (r)-1).astype(int)
rw = np.bincount(ind.ravel())
rw [0] - = len(x)
corrfun = np.bincount (ind.ravel(),(osub(s)** 2).ravel())
return r,corrfun,rw

来比较,我修改你的代码如下:

  def twopointcorr(x,y ,s,dr):
width = np.max(x)-np.min(x)
height = np.max(y)-np.min( y)

n = len(x)

maxR = np.sqrt((width / 2)** 2 +(height / 2)** 2)$ b $对于范围(0,n)中的j,b
r = np.arange(0,maxR,dr)
corrfun = r * 0
rw = r * 0

hypot = np.sqrt((x [j] -x)** 2+(y [j] -y)** 2)
ind = [np.abs(rh).argmin()for h in hypot]
for k,v in enumerate(ind):
if j == k:
continue
corrfun [v] + =(s [k] -s [j])** 2
rw [v] + = 1

return r,corrfun,rw

这里是检查结果的代码:

pre $ import numpy as np

n = 1000
x = np.random.rand(n)
y = np.random.rand(n)
s = np.random.rand(n)

r1,corrfun1,rw1 = twopointcorr(x,y,s,0.1)
r2,corrfun2,rw2 = twopointcorr2(x,y,s,0.1)

assert np.allclose(r1,r2)
assert np.allclose(corrfun1,corrfun2)
assert np.allclose(rw1,rw2)

和%timeit结果:

 %timeit twopointcorr(x,y,s ,0.1)
%timeit twopointcorr2(x,y,s,0.1)



  1循环,最好3:每循环5.16 s 
10循环,最好3:每循环134 ms


I am trying to optimize a snippet that gets called a lot (millions of times) so any type of speed improvement (hopefully removing the for-loop) would be great.

I am computing a correlation function of some j'th particle with all others

C_j(|r-r'|) = sqrt(E((s_j(r')-s_k(r))^2)) averaged over k.

My idea is to have a variable corrfun which bins data into some bins (the r, defined elsewhere). I find what bin of r each s_k belongs to and this is stored in ind. So ind[0] is the index of r (and thus the corrfun) for which the j=0 point corresponds to. Multiple points can fall into the same bin (in fact I want bins to be big enough to contain multiple points) so I sum together all of the (s_j(r')-s_k(r))^2 and then divide by number of points in that bin (stored in variable rw). The code I ended up making for this is the following (np is for numpy):

for k, v in enumerate(ind):
        if j==k:
            continue
        corrfun[v] += (s[k]-s[j])**2
        rw[v] += 1
rw2 = rw
rw2[rw < 1] = 1
corrfun = np.sqrt(np.divide(corrfun, rw2))

Note, the rw2 business was because I want to avoid divide by 0 problems but I do return the rw array and I want to be able to differentiate between the rw=0 and rw=1 elements. Perhaps there is a more elegant solution for this as well.

Is there a way to make the for-loop faster? While I would like to not add the self interaction (j==k) I am even ok with having self interaction if it means I can get significantly faster calculation (length of ind ~ 1E6 so self interaction is probably insignificant anyways).

Thank you!

Ilya

Edit:

Here is the full code. Note, in the full code I am averaging over j as well.

import numpy as np

def twopointcorr(x,y,s,dr):

    width = np.max(x)-np.min(x)
    height = np.max(y)-np.min(y)

    n = len(x)

    maxR = np.sqrt((width/2)**2 + (height/2)**2)

    r = np.arange(0, maxR, dr)
    print(r)
    corrfun = r*0
    rw = r*0
    print(maxR)
    ''' go through all points'''
    for j in range(0, n-1):
        hypot = np.sqrt((x[j]-x)**2+(y[j]-y)**2)
        ind = [np.abs(r-h).argmin() for h in hypot]

        for k, v in enumerate(ind):
            if j==k:
                continue
            corrfun[v] += (s[k]-s[j])**2
            rw[v] += 1

    rw2 = rw
    rw2[rw < 1] = 1
    corrfun = np.sqrt(np.divide(corrfun, rw2))
    return r, corrfun, rw

I debug test it the following way

from twopointcorr import twopointcorr
import numpy as np
import matplotlib.pyplot as plt
import time

n=1000
x = np.random.rand(n)
y = np.random.rand(n)
s = np.random.rand(n)

print('running two point corr functinon')

start_time = time.time()
r,corrfun,rw = twopointcorr(x,y,s,0.1)
print("--- Execution time is %s seconds ---" % (time.time() - start_time))

fig1=plt.figure()
plt.plot(r, corrfun,'-x')

fig2=plt.figure()
plt.plot(r, rw,'-x')
plt.show()

Again, the main issue is that in the real dataset n~1E6. I can resample to make it smaller, of course, but I would love to actually crank through the dataset.

解决方案

Here is the code that use broadcast, hypot, round, bincount to remove all the loops:

def twopointcorr2(x, y, s, dr):
    width = np.max(x)-np.min(x)
    height = np.max(y)-np.min(y)
    n = len(x)
    maxR = np.sqrt((width/2)**2 + (height/2)**2)
    r = np.arange(0, maxR, dr)    
    osub = lambda x:np.subtract.outer(x, x)

    ind = np.clip(np.round(np.hypot(osub(x), osub(y)) / dr), 0, len(r)-1).astype(int)
    rw = np.bincount(ind.ravel())
    rw[0] -= len(x)
    corrfun = np.bincount(ind.ravel(), (osub(s)**2).ravel())
    return r, corrfun, rw

to compare, I modified your code as follows:

def twopointcorr(x,y,s,dr):
    width = np.max(x)-np.min(x)
    height = np.max(y)-np.min(y)

    n = len(x)

    maxR = np.sqrt((width/2)**2 + (height/2)**2)

    r = np.arange(0, maxR, dr)
    corrfun = r*0
    rw = r*0
    for j in range(0, n):
        hypot = np.sqrt((x[j]-x)**2+(y[j]-y)**2)
        ind = [np.abs(r-h).argmin() for h in hypot]
        for k, v in enumerate(ind):
            if j==k:
                continue
            corrfun[v] += (s[k]-s[j])**2
            rw[v] += 1

    return r, corrfun, rw        

and here is the code to check the results:

import numpy as np

n=1000
x = np.random.rand(n)
y = np.random.rand(n)
s = np.random.rand(n)

r1, corrfun1, rw1 = twopointcorr(x,y,s,0.1)
r2, corrfun2, rw2 = twopointcorr2(x,y,s,0.1)

assert np.allclose(r1, r2)
assert np.allclose(corrfun1, corrfun2)
assert np.allclose(rw1, rw2)        

and the %timeit results:

%timeit twopointcorr(x,y,s,0.1)
%timeit twopointcorr2(x,y,s,0.1)    

outputs:

1 loop, best of 3: 5.16 s per loop
10 loops, best of 3: 134 ms per loop

这篇关于向python中的重复索引向量化循环的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆