Fast differences of all row pairs with NumPy


Question


I am using an algorithm that requires that each example has a matrix, say Xi which is ai x b, and that for each of the O(n^2) pairs of examples, I find the difference between each row Xiu - Xjv, then sum the outer products sum_u sum_v np.outer(Xiu - Xjv, Xiu - Xjv).

Unfortunately this inner double sum is fairly slow, and is causing the running time to spiral out of control on large datasets. Right now I'm just using for loops to do this. Is there some pythonic way to speed up this inner operation? I have been trying to think one up and having no luck.

To clarify, for each of the n examples there's a matrix Xi with dimensions ai x b, where ai differs per example. For each pair (Xi, Xj) I want to go through all O(ai * aj) pairs of rows between the two matrices, find Xiu - Xjv, take its outer product with itself, np.outer(Xiu - Xjv, Xiu - Xjv), and finally sum all of those outer products.

For instance: Suppose D is [[1,2],[3,4]] and we're just working with that for both matrices.

Then, for instance, one step might be np.outer(D[0] - D[1], D[0] - D[1]), which would be the matrix [[4,4],[4,4]].

Simply enough, the (0,0) and (1,1) row pairs give zero matrices, while (0,1) and (1,0) each give [[4,4],[4,4]], so the final sum of all four outer products of pairs is [[8,8],[8,8]].
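The worked example above can be checked with a brute-force double loop (a minimal sketch, using D for both matrices as in the question):

```python
import numpy as np

# Brute-force check of the worked example: D = [[1, 2], [3, 4]],
# summing the outer products over all four row pairs.
D = np.array([[1, 2], [3, 4]])
total = np.zeros((2, 2))
for u in range(D.shape[0]):
    for v in range(D.shape[0]):
        diff = D[u] - D[v]
        total += np.outer(diff, diff)
print(total)   # [[8. 8.] [8. 8.]]
```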

Solution

OK this one was fun. I still can't help thinking it can all be done with a single ingenious call to numpy.tensordot but at any rate this seems to have eliminated all Python-level loops:

import numpy

def slow( a, b=None ):
    """Reference implementation: explicit Python double loop over row pairs."""
    if b is None: b = a
    a = numpy.asarray( a, dtype=float )
    b = numpy.asarray( b, dtype=float )

    out = 0.0
    for ai in a:
        for bj in b:
            dij = bj - ai
            out += numpy.outer( dij, dij )
    return out

def opsum( a, b=None ):
    """Closed-form equivalent of slow(), with no Python-level loops."""
    if b is None: b = a
    a = numpy.asarray( a, dtype=float )
    b = numpy.asarray( b, dtype=float )

    RA, CA = a.shape
    RB, CB = b.shape
    if CA != CB: raise ValueError( "input matrices should have the same number of columns" )

    # cross terms: -sum_i sum_j ( a_i b_j^T + b_j a_i^T )
    out = -numpy.outer( a.sum( axis=0 ), b.sum( axis=0 ) )
    out += out.T
    # squared terms: each a_i a_i^T appears RB times, each b_j b_j^T RA times
    out += RB * ( a.T @ a )
    out += RA * ( b.T @ b )
    return out

def test( a, b=None ):
    truth = slow( a, b )
    fast = opsum( a, b )
    print( "ground truth:" )
    print( truth )
    print( "optimized:" )
    print( fast )
    print( "max abs discrepancy:" )
    print( numpy.abs( fast - truth ).max() )
    print( "" )

# OP example
test( [[1,2], [3,4]] )

# non-symmetric example
a = [ [ 1, 2, 3 ], [-4, 5, 6 ], [7, -8, 9 ], [ 10, 11, -12 ] ]
a = numpy.array( a, dtype=float )
b = a[ ::2, ::-1 ] + 15
test( a, b )

# non-integer example
test( numpy.random.randn( *a.shape ), numpy.random.randn( *b.shape ) )
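Why opsum works: expanding each outer product and exchanging the order of summation collapses the double loop into row sums and Gram matrices. A sketch of the algebra, writing a_i and b_j for the rows and s_A, s_B for the column-wise sums:

```latex
\sum_{i=1}^{R_A}\sum_{j=1}^{R_B} (a_i - b_j)(a_i - b_j)^\top
  = R_B\, A^\top A + R_A\, B^\top B - s_A s_B^\top - s_B s_A^\top,
\qquad s_A = \sum_i a_i,\quad s_B = \sum_j b_j
```

The four terms on the right correspond to the four accumulations in opsum: the negated outer product of the row sums plus its transpose, then the two scaled Gram matrices.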

With that (rather arbitrary) example input, timing of opsum (measured using timeit opsum(a,b) in IPython) looks only about a factor of 3–5 better than slow. But of course it scales much better: scale up the numbers of data-points by a factor of 100, and the number of features by a factor of 10, and then we're already looking at about a factor-10,000 increase in speed.
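The single-call idea hinted at above can at least be approximated with numpy.einsum: broadcast all pairwise row differences into one (RA, RB, C) tensor and contract it against itself. A sketch under the assumption that O(RA * RB * C) temporary memory is acceptable (opsum avoids it entirely); the function name opsum_einsum is illustrative, not from the original answer:

```python
import numpy as np

def opsum_einsum(a, b=None):
    """Vectorized middle ground: materialize all pairwise row differences
    with broadcasting, then sum the outer products in one einsum call."""
    if b is None:
        b = a
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d = a[:, None, :] - b[None, :, :]       # shape (RA, RB, C)
    # sum_{i,j} d[i,j,:] outer d[i,j,:]  ->  (C, C)
    return np.einsum('ijk,ijl->kl', d, d)

print(opsum_einsum([[1, 2], [3, 4]]))       # [[8. 8.] [8. 8.]]
```

This eliminates the Python loops like opsum does, but its memory use grows with the product of the row counts, so the closed-form opsum remains the better choice for large inputs.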
