I've mangled Cython badly, it's performing worse than pure Python. Why?

Question

I'm rather new to Python and absolutely ignorant of C (unfortunately) so I am struggling to properly understand some aspects of working with Cython.

After profiling a Python program and discovering that it was just a couple of loops that were hogging most of the time, I decided to look into dumping them into Cython. Initially, I just let Cython interpret the Python as it was, and the result was a (remarkable!) ~2x speed boost. Cool!

From the Python main, I pass the function two 2-D arrays ("a" and "b") and a float, "d", and it returns a list, "newlist". As examples:

a =[[12.7, 13.5, 1.0],[23.4, 43.1, 1.0],...]
b =[[0.46,0.95,0],[4.56,0.92,0],...]
d = 0.1

Here is the original code, with just the cdefs added for Cython:

def loop(a, b, d):

    cdef int i, j
    cdef double x, y

    newlist = []

    for i in range(len(a)):
        if b[i][2] != 1:
            for j in range(i+1,len(a)):
                if a[i] == a[j] and b[j][2] != 1:
                    x = b[i][0]+b[j][0]
                    y = b[i][1]+b[j][1]
                    b[i][2],b[j][2] = 1,1

                    if abs(y)/abs(x) > d:
                        if y > 0: newlist.append([a[i][0],a[i][1],y])

    return newlist

In "pure Python", this ran (with several tens of thousands of loop iterations) in ~12.5s. In Cython it ran in ~6.3s. Great progress with near-zero work done!

However, with a little reading, it was clear that much, much more could be done, so I set about trying to apply some type changes to get things going even faster, following the Cython docs, here (also referenced in the comments).

Here are the collected modifications, meant to mimic the Cython docs:

import numpy as np
cimport numpy as np

DtypeA = np.float
DtypeB = np.int

ctypedef np.float_t DtypeA_t
ctypedef np.int_t DtypeB_t

def loop(np.ndarray[DtypeA_t, ndim=2] A,
         np.ndarray[DtypeA_t, ndim=2] B,
         np.ndarray[DtypeB_t, ndim=1] C,
         float D):

    cdef Py_ssize_t i, j
    cdef float x, y

    cdef np.ndarray[DtypeA_t, ndim=2] NEW_ARRAY = np.zeros((len(C),3), dtype=DtypeA)

    for i in range(len(C)):
        if C[i] != 1:
            for j in range(i+1,len(C)):
                if A[i][0]==A[j][0] and A[i][1]==A[j][1] and C[j]!= 1:
                    x = B[i][0]+B[j][0]
                    y = B[i][1]+B[j][1]
                    C[i],C[j] = 1,1

                    if abs(y)/abs(x) > D:
                        if y > 0: NEW_ARRAY[i]=([A[i][0],A[i][1],y])

    return NEW_ARRAY

Among other things, I split the previous array "b" into two different input arrays "B" and "C", because each row of "b" contained 2 float elements and an integer that just acted as a flag. So I removed the flag integers and put them in a separate 1-D array, "C". So, the inputs now looked like this:

A =[[12.7, 13.5, 1.0],[23.4, 43.1, 1.0],...]
B =[[0.46,0.95],[4.56,0.92],...]
C =[0,0,...]
D = 0.1
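
The split described above can be sketched in plain Python as follows. This is only an illustration, assuming "b" is a list of [float, float, flag] rows like the earlier sample; the helper name `split_flags` is hypothetical and does not appear in the original post.

```python
def split_flags(b):
    """Split rows of [x, y, flag] into a list of [x, y] pairs and a flat list of flags."""
    B = [[row[0], row[1]] for row in b]   # the two float columns
    C = [int(row[2]) for row in b]        # the integer flag column
    return B, C


# Sample rows taken from the "b" example earlier in the question.
b = [[0.46, 0.95, 0], [4.56, 0.92, 0]]
B, C = split_flags(b)
print(B)  # [[0.46, 0.95], [4.56, 0.92]]
print(C)  # [0, 0]
```

For the typed signature shown above, these lists would presumably still have to be converted to NumPy arrays of the matching dtype (for example with `np.asarray`) before being passed in.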

Ideally, this should go much faster with all the variables now being typed(?)...but obviously I'm doing something very wrong, because the function now comes in at a time of 35.3s...way WORSE than the "pure Python"!!

What am I botching so badly? Thanks for reading!

Recommended answer

I believe the use of the indexing notation b[j][0] may be throwing Cython off, making it impossible for it to use fast indexing operations behind the scenes. Incidentally, even in pure Python code this style is not idiomatic and may lead to slower code.

Try instead to use the notation b[j,0] throughout and see if it improves your performance.
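
To see why the two notations differ, here is a small pure-Python sketch. `TracingGrid` is a hypothetical stand-in written only for this illustration (it is neither NumPy nor Cython code); it records the keys passed to its `__getitem__`. It shows a plain language rule: `g[j][0]` is two separate indexing operations with an intermediate row object in between, while `g[j, 0]` is a single operation that receives the tuple `(j, 0)`.

```python
class TracingGrid:
    """Hypothetical 2-D container that logs every key passed to __getitem__."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = []

    def __getitem__(self, key):
        self.calls.append(key)
        if isinstance(key, tuple):
            # Single lookup: both indices arrive together as (row, col).
            r, c = key
            return self.rows[r][c]
        # Row lookup only: returns an intermediate row object (a list here).
        return self.rows[key]


g = TracingGrid([[0.46, 0.95], [4.56, 0.92]])

chained = g[1][0]                 # step 1: g[1] builds a row; step 2: row[0]
chained_calls = list(g.calls)
print(chained_calls)              # [1]  (the grid never sees the column index)

g.calls.clear()
direct = g[1, 0]                  # one call that carries both indices
direct_calls = list(g.calls)
print(direct_calls)               # [(1, 0)]

print(chained == direct)          # True
```

Both forms return the same value, but only the second hands the container both indices at once. If the answer's suspicion is right, that is the form Cython needs in order to turn a typed array access into a direct memory lookup instead of going through an intermediate Python object.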
