用Numpy进行外部减法 [英] Outer subtraction with Numpy

查看:168
本文介绍了用Numpy进行外部减法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我只想这样做:C_i = \ Sum_k(A_i -B_k)^ 2我看到,使用简单的 for循环比使用 numpy.subtract.outer 的计算速度更快!无论如何,我觉得 numpy.einsum 最快.我不太了解 numpy.einsum .有人可以帮我吗?另外,如果有人解释如何用 numpy.einsum 编写由向量/矩阵组成的一般求和表达式,那将是很好的?

I simply want to do: C_i=\Sum_k (A_i -B_k)^2 I saw that this calculation is faster with a simple for loop than with the numpy.subtract.outer! Anyway I feel that numpy.einsum would be the fastest. I could not understand numpy.einsum that well. Can anyone please help me out? Additionally, it would be great if someone explains how a general summation expression consisting of vector/matrices can be written with numpy.einsum?

我没有在网上找到针对此特定问题的解决方案.很抱歉,如果以某种方式重复.

I did not find solution for this particular problem on the web. Sorry if duplicate in some manner.

具有循环和 numpy.subtract.outer -

A)有循环

import timeit
code1="""
import numpy as np

N=10000;

a=np.random.rand(N); b=10*(np.random.rand(N)-0.5);

def A1(x,y):
    Nx=len(x)
    z=np.zeros(Nx)
    for i in np.arange(Nx):
        z[i]=np.sum((x[i]-y)*(x[i]-y))

    return z

C1=A1(a,b)"""
elapsed_time = timeit.timeit(code1, number=10)/10
print "time=", elapsed_time

B)使用 numpy.subtract.outer

import timeit
code1="""
import numpy as np

N=10000;

a=np.random.rand(N); b=10*(np.random.rand(N)-0.5);

def A2(x,y):
    C=np.subtract.outer(x,y);
    return np.sum(C*C, axis=1)

C2=A2(a,b)"""
elapsed_time = timeit.timeit(code1, number=10)/10
print "time=", elapsed_time

对于N = 10000,循环变得更快.对于N = 100,外部减法变得更快.对于N = 10 ^ 5,外部减法使用8GB内存在我的桌面上面临内存问题!

For N=10000 the loop becomes faster. For N=100, the outer subtract becomes faster. For N=10^5, outer subtract faces memory issue on my desktop with 8GB ram!

推荐答案

至少使用Numba或Fortran实现

这两个函数都很慢.Python循环效率极低(A1),而分配大型临时数组的速度也很慢(A2,部分也为A1).

Use at least Numba, or a Fortran Implementation

Both of your functions are quite slow. Python loops are very inefficient (A1), and allocating large temporary arrays is also slow (A2 and partially also A1).

针对小型数组的朴素Numba实现

import numba as nb
import numpy as np

@nb.njit(parallel=True, fastmath=True)
def A_nb_p(x,y):

    z=np.empty(x.shape[0])
    for i in nb.prange(x.shape[0]):
        TMP=0.
        for j in range(y.shape[0]):
            TMP+=(x[i]-y[j])**2
        z[i]=TMP

    return z

时间

import time
N=int(1e5)
a=np.random.rand(N)
b=10*(np.random.rand(N)-0.5)

t1=time.time()
res_1=A1(a,b)
print(time.time()-t1)
#95.30195426940918 s

t1=time.time()
res_2=A_nb_p(a,b)
print(time.time()-t1)
#0.28573083877563477 s

#A2 is too slow to measure

如上所述,这是对大型数组的幼稚实现,因为Numba无法按块进行计算,这会导致大量缓存丢失,从而导致性能下降.某些Fortran或C-编译器应该至少能够自动执行以下优化(逐块评估).

As written above this is a naive implementation on larger arrays, since Numba isn't able to do the calculation blockwise, which leads to a lot of cache misses and therefore bad performance. Some Fortran or C- compiler should be able to do at least the following optimization (block-wise evaluation) automatically.

大型阵列的实现

@nb.njit(parallel=True, fastmath=True)
def A_nb_p_2(x,y):
    blk_s=1024
    z=np.zeros(x.shape[0])
    num_blk_x=x.shape[0]//blk_s
    num_blk_y=y.shape[0]//blk_s

    for ii in nb.prange(num_blk_x):
        for jj in range(num_blk_y):
            for i in range(blk_s):
                TMP=z[ii*blk_s+i]
                for j in range(blk_s):
                    TMP+=(x[ii*blk_s+i]-y[jj*blk_s+j])**2
                z[ii*blk_s+i]=TMP

    for i in nb.prange(x.shape[0]):
        TMP=z[i]
        for j in range(num_blk_y*blk_s,y.shape[0]):
            TMP+=(x[i]-y[j])**2
        z[i]=TMP

    for i in nb.prange(num_blk_x*blk_s,x.shape[0]):
        TMP=z[i]
        for j in range(num_blk_y*blk_s):
            TMP+=(x[i]-y[j])**2
        z[i]=TMP

    return z

时间

N=int(2*1e6)
a=np.random.rand(N)
b=10*(np.random.rand(N)-0.5)

t1=time.time()
res_1=A_nb_p(a,b)
print(time.time()-t1)
#298.9394392967224

t1=time.time()
res_2=A_nb_p_2(a,b)
print(time.time()-t1)
#70.12

这篇关于用Numpy进行外部减法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆