Cython string concatenation is super slow; what else does it do poorly?


Question


I have a large Python code base which we recently started compiling with Cython. Without making any changes to the code, I expected performance to stay about the same, but we planned to optimize heavier computations with Cython specific code after profiling. However, the speed of the compiled application plummeted and it appears to be across the board. Methods are taking anywhere from 10% to 300% longer than before.

I've been playing around with test code to try and find things Cython does poorly and it appears that string manipulation is one of them. My question is, am I doing something wrong or is Cython really just bad at some things? Can you help me understand why this is so bad and what else Cython might do very poorly?

EDIT: Let me try to clarify. I realize that this type of string concatenation is very bad; I just noticed it has a huge speed difference so I posted it (probably a bad idea). The codebase doesn't have this type of terrible code but has still slowed dramatically and I'm hoping for pointers on what type of constructs Cython handles poorly so I can figure out where to look. I've tried profiling but it was not particularly helpful.

For reference, here is my string manipulation test code. I realize the code below is terrible and useless, but I'm still shocked by the speed difference.

# pyCode.py
def str1():
    val = ""
    for i in xrange(100000):
        val = str(i)

def str2():
    val = ""
    for i in xrange(100000):
        val += 'a'

def str3():
    val = ""
    for i in xrange(100000):
        val += str(i)

Timing code

# compare.py
import timeit

pyTimes = {}
cyTimes = {}

# STR1
number=10

setup = "import pyCode"
stmt = "pyCode.str1()"
pyTimes['str1'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

setup = "import cyCode"
stmt = "cyCode.str1()"
cyTimes['str1'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

# STR2
setup = "import pyCode"
stmt = "pyCode.str2()"
pyTimes['str2'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

setup = "import cyCode"
stmt = "cyCode.str2()"
cyTimes['str2'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

# STR3
setup = "import pyCode"
stmt = "pyCode.str3()"
pyTimes['str3'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

setup = "import cyCode"
stmt = "cyCode.str3()"
cyTimes['str3'] = timeit.timeit(stmt=stmt, setup=setup, number=number)

for funcName in sorted(pyTimes.viewkeys()):
    print "PY {} took {}s".format(funcName, pyTimes[funcName])
    print "CY {} took {}s".format(funcName, cyTimes[funcName])
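The three near-identical timing stanzas above could be folded into a loop. The following is only a sketch of that idea, not the asker's script: the helper names `time_functions` and `report` are hypothetical, and it is written to run on both Python 2.7 and Python 3 (so it builds the report lines instead of using the `print` statement).

```python
import timeit


def time_functions(module_name, func_names, number=10):
    """Return {func_name: seconds} for calling each function `number` times.

    `module_name` must be importable; each name in `func_names` must be a
    zero-argument callable in that module.
    """
    times = {}
    setup = "import {}".format(module_name)
    for name in func_names:
        stmt = "{}.{}()".format(module_name, name)
        times[name] = timeit.timeit(stmt=stmt, setup=setup, number=number)
    return times


def report(py_times, cy_times):
    """Build the same 'PY ... / CY ...' lines as the original script."""
    lines = []
    for name in sorted(py_times):
        lines.append("PY {} took {}s".format(name, py_times[name]))
        lines.append("CY {} took {}s".format(name, cy_times[name]))
    return lines
```

With the two modules from the question on the import path, usage would look like `report(time_functions("pyCode", ["str1", "str2", "str3"]), time_functions("cyCode", ["str1", "str2", "str3"]))`.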

The Cython module was compiled with:

cp pyCode.py cyCode.py
cython cyCode.py
gcc -O2 -fPIC -shared -I$PYTHONHOME/include/python2.7 \
    -fno-strict-aliasing -fno-strict-overflow -o cyCode.so cyCode.c

Resulting timings

> python compare.py 
PY str1 took 0.1610019207s
CY str1 took 0.104282140732s
PY str2 took 0.0739600658417s
CY str2 took 2.34380102158s
PY str3 took 0.224936962128s
CY str3 took 21.6859738827s
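To put those numbers in proportion, the Cython/CPython ratio can be computed per function. This is a small sketch using only the timings printed above; the `slowdown` helper is a hypothetical name. By these figures the compiled `str1` is somewhat faster than the interpreted one, while `str2` comes out roughly 30 times slower and `str3` nearly 100 times slower.

```python
# Timings copied from the output above (seconds for 10 calls each).
py_times = {"str1": 0.1610019207, "str2": 0.0739600658417, "str3": 0.224936962128}
cy_times = {"str1": 0.104282140732, "str2": 2.34380102158, "str3": 21.6859738827}


def slowdown(py_times, cy_times):
    """Return {name: cython_time / cpython_time}; values above 1 mean slower."""
    return {name: cy_times[name] / py_times[name] for name in py_times}


ratios = slowdown(py_times, cy_times)
for name in sorted(ratios):
    print("{}: Cython/CPython = {:.1f}x".format(name, ratios[name]))
```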

For reference, I've tried this with Cython 0.19.1 and 0.23.4. I've compiled the C code with gcc 4.8.2 and icc 14.0.2, trying various flags with both.

Solution

Worth reading: PEP 8 > Programming Recommendations:

Code should be written in a way that does not disadvantage other implementations of Python (PyPy, Jython, IronPython, Cython, Psyco, and such).

For example, do not rely on CPython's efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b. This optimization is fragile even in CPython (it only works for some types) and isn't present at all in implementations that don't use refcounting. In performance sensitive parts of the library, the ''.join() form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.

Reference: https://www.python.org/dev/peps/pep-0008/#programming-recommendations
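Applied to the test functions from the question, the PEP 8 advice might look like the sketch below. The function names are hypothetical, the functions return their result so they can be compared, and `range` is used so the code also runs on Python 3 (the original used `xrange` on Python 2.7).

```python
def str3_inplace(n=100000):
    """Original pattern: repeated += on a string (may be quadratic)."""
    val = ""
    for i in range(n):  # xrange in the original Python 2.7 code
        val += str(i)
    return val


def str3_join(n=100000):
    """PEP 8 pattern: collect the pieces, then join them once."""
    parts = []
    for i in range(n):
        parts.append(str(i))
    return "".join(parts)


def str2_repeat(n=100000):
    """For one repeated constant piece, string multiplication avoids the loop."""
    return "a" * n
```

Both `str3` variants produce the same string; only the join form avoids depending on the in-place concatenation optimization the PEP warns about.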
