在Python优化的点积 [英] Optimized dot product in Python

查看:177
本文介绍了在Python优化的点积的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

两个n维向量 U = [U1,U2,... UN] V = [V1,V2的点积,...,VN] 是由给U1 * V1 + U2 * V2 + ... + UN * VN <​​/ code>。

The dot product of two n-dimensional vectors u=[u1,u2,...un] and v=[v1,v2,...,vn] is is given by u1*v1 + u2*v2 + ... + un*vn.

一个问题<一href="http://stackoverflow.com/questions/1823293/optimized-method-for-calculating-cosine-distance-in-python/1823735#1823735">posted昨天鼓励我找计算仅使用标准库点的产品在Python的最快方法,没有第三方的模块或C / Fortran语言/ C ++调用。

A question posted yesterday encouraged me to find the fastest way to compute dot products in Python using only the standard library, no third-party modules or C/Fortran/C++ calls.

我计时四种不同的方法;到目前为止,似乎以最快的速度为总和(星图(MUL,izip(V1,V2)))(其中星图 izip 来自 itertools 模块)。

I timed four different approaches; so far the fastest seems to be sum(starmap(mul,izip(v1,v2))) (where starmap and izip come from the itertools module).

对于code presented以下,这些都是经过时间(以秒为单位,一百万运行):

For the code presented below, these are the elapsed times (in seconds, for one million runs):

d0: 12.01215
d1: 11.76151
d2: 12.54092
d3: 09.58523

你能想出一个更快的方法来做到这一点?

Can you think of a faster way to do this?

import timeit # module with timing subroutines                                                              
import random # module to generate random numnbers                                                          
from itertools import imap,starmap,izip
from operator import mul

def v(N=50,min=-10,max=10):
    """Generates a random vector (in an array) of dimension N; the                                          
    values are integers in the range [min,max]."""
    out = []
    for k in range(N):
        out.append(random.randint(min,max))
    return out

def check(v1,v2):
    if len(v1)!=len(v2):
        raise ValueError,"the lenght of both arrays must be the same"
    pass

def d0(v1,v2):
    """                                                                                                     
    d0 is Nominal approach:                                                                                 
    multiply/add in a loop                                                                                  
    """
    check(v1,v2)
    out = 0
    for k in range(len(v1)):
        out += v1[k] * v2[k]
    return out

def d1(v1,v2):
    """                                                                                                     
    d1 uses an imap (from itertools)                                                                        
    """
    check(v1,v2)
    return sum(imap(mul,v1,v2))

def d2(v1,v2):
    """                                                                                                     
    d2 uses a conventional map                                                                              
    """
    check(v1,v2)
    return sum(map(mul,v1,v2))

def d3(v1,v2):
    """                                                                                                     
    d3 uses a starmap (itertools) to apply the mul operator on an izipped (v1,v2)                           
    """
    check(v1,v2)
    return sum(starmap(mul,izip(v1,v2)))

# generate the test vectors                                                                                 
v1 = v()
v2 = v()

if __name__ == '__main__':

    # Generate two test vectors of dimension N                                                              

    t0 = timeit.Timer("d0(v1,v2)","from dot_product import d0,v1,v2")
    t1 = timeit.Timer("d1(v1,v2)","from dot_product import d1,v1,v2")
    t2 = timeit.Timer("d2(v1,v2)","from dot_product import d2,v1,v2")
    t3 = timeit.Timer("d3(v1,v2)","from dot_product import d3,v1,v2")

    print "d0 elapsed: ", t0.timeit()
    print "d1 elapsed: ", t1.timeit()
    print "d2 elapsed: ", t2.timeit()
    print "d3 elapsed: ", t3.timeit()

请注意,文件的名称必须为 dot_product.py 的脚本运行;我用的Python 2.5.1在Mac OS X版本10.5.8。

Notice that the name of the file must be dot_product.py for the script to run; I used Python 2.5.1 on a Mac OS X Version 10.5.8.

编辑:

我跑的剧本N = 1000,这是结果(以秒为单位,一百万运行):

I ran the script for N=1000 and these are the results (in seconds, for one million runs):

d0: 205.35457
d1: 208.13006
d2: 230.07463
d3: 155.29670

我想这是安全的假设,事实上,第三个选项是最快的方案(2)(四个presented)最慢的。

I guess it is safe to assume that, indeed, option three is the fastest and option two the slowest (of the four presented).

推荐答案

只是为了好玩,我写了D4,它采用numpy的:

Just for fun I wrote a "d4" which uses numpy:

from numpy import dot
def d4(v1, v2): 
    check(v1, v2)
    return dot(v1, v2)

我的结果(Python的2.5.1,Windows XP专业版SP3,2GHz的酷睿双核T7200):

My results (Python 2.5.1, XP Pro sp3, 2GHz Core2 Duo T7200):

d0 elapsed:  12.1977242918
d1 elapsed:  13.885232341
d2 elapsed:  13.7929552499
d3 elapsed:  11.0952246724

<打击> D4经过:56.3278584289#去numpy的

和,更有趣,我打开Psyco的:

And, for even more fun, I turned on psyco:

d0 elapsed:  0.965477735299
d1 elapsed:  12.5354792299
d2 elapsed:  12.9748163524
d3 elapsed:  9.78255448667

<打击> D4经过:54.4599059378

,我宣布D0赢家:)


更新

@ kaiser.se:我也许应该提到,我所做的一切转换为numpy的阵列第一:

@kaiser.se: I probably should have mentioned that I did convert everything to numpy arrays first:

from numpy import array
v3 = [array(vec) for vec in v1]
v4 = [array(vec) for vec in v2]

# then
t4 = timeit.Timer("d4(v3,v4)","from dot_product import d4,v3,v4")

和我包括检查(V1,V2),因为它包含在其他测试。离开它关闭会给numpy的不公平的优势(虽然它看起来像它可以使用一个)。该阵列转换剃掉约1秒钟(比我想象的还要少得多)。

And I included check(v1, v2) since it's included in the other tests. Leaving it off would give numpy an unfair advantage (though it looks like it could use one). The array conversion shaved off about a second (much less than I thought it would).

我所有的测试使用N = 50。

All of my tests were run with N=50.

@nikow:我使用numpy的1.0.4,这无疑是旧的,它肯定是可能的,因为我已经安装了它,他们已经提高了性能,在过去一年半的时间。

@nikow: I'm using numpy 1.0.4, which is undoubtedly old, it's certainly possible that they've improved performance over the last year and a half since I've installed it.


更新#2

@ kaiser.se哇,你是完全正确的。我必须一直在想,这些都是列表或某事的清单(我真的不知道我在想什么...... +1的结对编程)。

@kaiser.se Wow, you are totally right. I must have been thinking that these were lists of lists or something (I really have no idea what I was thinking ... +1 for pair programming).

请问这个样子:

v3 = array(v1)
v4 = array(v2)

新的结果:

d4 elapsed:  3.22535741274

使用Psyco的:

d4 elapsed:  2.09182619579

D0仍然Psyco的胜利,但numpy的可能是更好的整体,尤其是在更大的数据集。

d0 still wins with Psyco, but numpy is probably better overall, especially with larger data sets.

昨天我有点困扰我慢numpy的结果,因为presumably numpy的用于大量的运算的,并且有很多的优化。很明显,虽然,没有足够的麻烦,检查我的结果:)

Yesterday I was a bit bothered my slow numpy result, since presumably numpy is used for a lot of computation and has had a lot of optimization. Obviously though, not bothered enough to check my result :)

这篇关于在Python优化的点积的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆