String performance - Python 2.7 vs Python 3.4 under Windows 10 vs. Ubuntu

Question

Use case
A simple function which checks if a specific string is in another string at a position which is a multiple of 3 (see here for a real world example, finding stop codons in a DNA sequence).

Functions
sliding_window: takes a substring of length 3 and compares it to the search string; if they are identical it returns False, otherwise it moves 3 characters forward.
incremental_start: tries to find the search string; if the found position is not a multiple of 3, it searches again starting after the found position.

Please note: the sample data is chosen just to make sure that each function has to go through the complete string; the performance is similar with real data or random data.

Results

  • Python 2.7: On Windows 10, the initial sliding_window function could be improved by a factor of ~39 by using the function incremental_start instead. The improvement was slightly smaller on Ubuntu, ~34x, ~37x, ~18x (VM, AWS, native), but still in the same range.
  • Python 3.4: sliding_window became slower than in Python 2.7 (1.8x on Windows, 1.4x to 1.5x on all Ubuntus), but the incremental_start performance dropped on all Ubuntus by a factor of 4, 5, 1.7 (VM, AWS, native), while it hardly changed on Windows.
  • Windows vs Ubuntu
    Python2.7: the virtualized Ubuntus needed less time for both functions (~20-30%); native Ubuntu was about 25% slower for incremental_start, while sliding_window was 40% faster.
    Python3: the sliding_window function needed less time to finish (~50%), while incremental_start became slower by a factor of ~2-3.

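Questions

  • What causes the difference in performance between Python 2 and Python 3 on Linux vs. Windows?
  • How can such behavior be anticipated, and how can the code be adjusted for optimal performance?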

Code

import timeit

text = 'ATG' * 10**6
word = 'TAG'

def sliding_window(text, word):
    for pos in range(0, len(text), 3):
        if text[pos:pos + 3] == word:
            return False
    return True

def incremental_start(text, word):
    start = 0
    while start != -1:
        start = text.find(word, start + 1)
        if start % 3 == 0:
            return False
    return True

#sliding window
time = timeit.Timer(lambda: sliding_window(text, word), setup='from __main__ import text, word').timeit(number=10)
print('%3.3f' % time)

#incremental start
time = timeit.Timer(lambda: incremental_start(text, word), setup='from __main__ import text, word').timeit(number=500)
print('%3.3f' % time)
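
A side note on correctness rather than speed: as written, incremental_start begins its first search at index 1 (start + 1 with start = 0), so a match at position 0 would be missed. This does not affect the benchmark, since the sample data contains no match at all. The following is only a minimal sketch of a variant that also checks position 0; the name incremental_start_checked is hypothetical and not part of the original code.

```python
def sliding_window(text, word):
    # Reference implementation from the question: compare every aligned triplet.
    for pos in range(0, len(text), 3):
        if text[pos:pos + 3] == word:
            return False
    return True

def incremental_start_checked(text, word):
    # Hypothetical variant: start searching at index 0 instead of index 1.
    start = text.find(word)
    while start != -1:
        if start % 3 == 0:
            # Match at a multiple-of-3 position.
            return False
        start = text.find(word, start + 1)
    return True

# Both functions should agree on small hand-made inputs.
samples = ['TAGAAA', 'ATAGAA', 'ATGTAG', 'ATG' * 10]
agreement = [sliding_window(s, 'TAG') == incremental_start_checked(s, 'TAG')
             for s in samples]
print(agreement)
```

Like the original functions, both return False when the word is found at an aligned position and True otherwise.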

Tables

Ubuntu vs Windows    VM     AWS    Native   
Python2.7-Increment  79%    73%    126% 
Python2.7-Sliding    70%    70%    60%                  
Python3.4-Increment  307%   346%   201% 
Python3.4-Sliding    54%    59%    48%  

Py2 vs 3    Windows    VM    AWS    Native
Increment   105%       409%  501%   168%
Sliding     184%       143%  155%   147%

Absolute times in seconds
                 Win10   VM      AWS     Native
Py2.7-Increment  1.759   1.391   1.279   2.215 
Py2.7-Sliding    1.361   0.955   0.958   0.823 

Py3.4-Increment  1.853   5.692   6.406   3.722 
Py3.4-Sliding    2.507   1.365   1.482   1.214 
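
The factors quoted in the results can be recomputed from these absolute times, keeping in mind that sliding_window was timed with number=10 and incremental_start with number=500, so the totals have to be normalized per call first. The sketch below only redoes that arithmetic; the variable and function names are illustrative, and 'VM' refers to the virtualized Ubuntu column.

```python
# Total times in seconds, copied from the "Absolute times" table.
# 'VM' is the virtualized Ubuntu column.
PLATFORMS = ['Win10', 'VM', 'AWS', 'Native']
TOTALS = {
    'py27_increment': [1.759, 1.391, 1.279, 2.215],
    'py27_sliding':   [1.361, 0.955, 0.958, 0.823],
    'py34_increment': [1.853, 5.692, 6.406, 3.722],
    'py34_sliding':   [2.507, 1.365, 1.482, 1.214],
}
# Number of timeit repetitions used for each function in the question.
RUNS = {'increment': 500, 'sliding': 10}

def per_call(row):
    # Convert a row of total times into seconds per single call.
    runs = RUNS[row.split('_')[1]]
    return [total / runs for total in TOTALS[row]]

def ratios(numerator, denominator):
    return [n / d for n, d in zip(numerator, denominator)]

# How much faster incremental_start is than sliding_window on Python 2.7.
speedup_py27 = dict(zip(
    PLATFORMS, ratios(per_call('py27_sliding'), per_call('py27_increment'))))

# Python 3.4 time relative to Python 2.7 time for incremental_start.
py3_vs_py2_increment = dict(zip(
    PLATFORMS, ratios(TOTALS['py34_increment'], TOTALS['py27_increment'])))

for name in PLATFORMS:
    print('%-7s speed-up on 2.7: %5.1fx   increment 3.4 vs 2.7: %4.0f%%'
          % (name, speedup_py27[name], 100 * py3_vs_py2_increment[name]))
```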

Details
Windows 10: Native Windows, 32bit Python 3.4.3 or 2.7.9, i5-2500, 16GB RAM
Ubuntu virtual machine: 14.04, running on the Windows host, 64bit Python 3.4.3, Python 2.7.6, 4 cores, 4GB RAM
AWS: 14.04, AWS micro instance, 64bit Python 3.4.3, Python 2.7.6
Native Ubuntu: 14.04, 64bit Python 3.4.3, Python 2.7.6, i5-2500, 16GB RAM [identical to Win10 machine]

As suggested by Ingaz, xrange and bytes were used. This gave a slight improvement in performance, but there is still a massive drop in performance on Ubuntu with Python 3.4. The culprit seems to be find, which is much slower when Ubuntu and Py3.4 are combined (the same holds for Py3.5, which was compiled from source). This seems to depend on the Linux flavor: on Debian, Py2.7 and Py3.4 performed identically, while on RedHat Py2.7 was considerably faster than Py3.4.
For better comparison, Py3.4 is now used in 64bit on Windows10 and Ubuntu. Py27 is still used on Win10.

import timeit, sys

if sys.version_info >= (3,0):
    from builtins import range as xrange

def sliding_window(text, word):
    for pos in range(0, len(text), 3):
        if text[pos:pos + 3] == word:
            return False
    return True

def xsliding_window(text, word):
    for pos in xrange(0, len(text), 3):
        if text[pos:pos + 3] == word:
            return False
    return True

def incremental_start(text, word):
    start = 0
    while start != -1:
        start = text.find(word, start + 1)
        if start % 3 == 0:
            return False
    return True

text = 'aaa' * 10**6
word = 'aaA'
byte_text = b'aaa' * 10**6
byte_word = b'aaA'

time = timeit.Timer(lambda: sliding_window(text, word), setup='from __main__ import text, word').timeit(number=10)
print('Sliding, regular:      %3.3f' % time)

time = timeit.Timer(lambda: incremental_start(text, word), setup='from __main__ import text, word').timeit(number=500)
print('Incremental, regular:  %3.3f' % time)

time = timeit.Timer(lambda: sliding_window(byte_text, byte_word), setup='from __main__ import byte_text, byte_word').timeit(number=10)
print('Sliding, byte string:  %3.3f' % time)

time = timeit.Timer(lambda: incremental_start(byte_text, byte_word), setup='from __main__ import byte_text, byte_word').timeit(number=500)
print('Incremental, bytes:    %3.3f' % time)

time = timeit.Timer(lambda: xsliding_window(byte_text, byte_word), setup='from __main__ import byte_text, byte_word').timeit(number=10)
print('Sliding, xrange&bytes: %3.3f' % time)

time = timeit.Timer(lambda: text.find(word), setup='from __main__ import text, word').timeit(number=1000)
print('simple find in string: %3.3f' % time)


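                        Win10-py27  Win10-py35  VM-py27  VM-py34
Sliding, regular:       1.440       2.674       0.993    1.368
Incremental, regular:   1.864       1.425       1.436    5.711
Sliding, byte string:   1.439       2.388       1.048    1.219
Incremental, bytes:     1.887       1.405       1.429    5.750
Sliding, xrange&bytes:  1.332       2.356       0.772    1.224
simple find in string:  3.756       2.811       2.818    11.361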

Answer

Although you are measuring the speed of the same code, the underlying structures in your code are different.

A. range in 2.7 returns type 'list', while range in 3.4 is class 'range'

B. 'ATG' * 10**6 is a bytes string in 2.7 and a unicode string in 3.4
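
Points A and B can be checked directly by running the same few lines under each interpreter. This is just an illustrative sketch; the variable names are made up for the example.

```python
import sys

text = 'ATG' * 3

# On 2.7 range() builds a list; on 3.x it returns a lazy range object.
range_type = type(range(3))

# On 2.7 bytes is an alias of str, so a plain literal is a byte string;
# on 3.x a plain literal is a unicode str and this is False.
text_is_bytes = isinstance(text, bytes)

print('Python %d.%d' % sys.version_info[:2])
print('type(range(3)):', range_type)
print('plain literal is bytes:', text_is_bytes)
print("type(b'ATG'):", type(b'ATG'))
```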

You can try to produce more comparable results if you: a) use xrange in the 2.7 variant, b) use the same string type in both examples, either bytes strings (b'ATG') or unicode strings.

I suspected that the difference in performance stems from two main factors: a) 32bit vs 64bit, b) the C compiler.

So, I ran tests for:

  1. ActiveState Python 2.7.10 32bit
  2. ActiveState Python 2.7.10 64bit
  3. Official distribution Python 2.7.11 32bit
  4. Official distribution Python 2.7.11 64bit
  5. Python 2.7.6 64bit, Ubuntu on Windows 10
  6. pypy-5.1.1-win32
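
When comparing builds like these, it can help to record which implementation, pointer width and compiler each interpreter reports next to its timings. A minimal sketch using only the standard library follows; the helper name describe_interpreter is hypothetical.

```python
import platform
import struct

def describe_interpreter():
    # Collect the build properties suspected above to influence performance.
    return {
        'implementation': platform.python_implementation(),  # e.g. CPython, PyPy
        'version': platform.python_version(),
        'compiler': platform.python_compiler(),              # e.g. a GCC or MSC string
        'pointer_bits': struct.calcsize('P') * 8,            # 32 or 64
        'system': platform.system(),
    }

info = describe_interpreter()
for key in sorted(info):
    print('%-15s %s' % (key, info[key]))
```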

What I expected

I expected that:

  • the 64bit version would be slower
  • ActiveState would be a bit faster
  • PyPy would be faster
  • Ubuntu on Windows 10 - ???

Test                    as32b   as64b   off32b   off64b  ubw64b  pypy5.1.1
Sliding, regular:       1.232   1.230   1.281    1.136   0.951   0.099
Incremental, regular:   1.744   1.690   2.219    1.647   1.472   2.772
Sliding, byte string:   1.223   1.207   1.280    1.127   0.926   0.101
Incremental, bytes:     1.720   1.701   2.206    1.646   1.568   2.774
Sliding, xrange&bytes:  1.117   1.102   1.162    0.962   0.779   0.109
simple find in string:  3.443   3.412   4.607    3.300   2.487   0.289

And the winner on Windows 10 is ... Ubuntu Python compiled by GCC 4.8.2 for Linux!

This result was completely unexpected for me.

32 vs 64: turned out to be irrelevant.

PyPy: as always megafast, except in the cases when it's not.

I can't interpret these results; the OP's question turned out to be not as simple as it seemed.
