Why is creating a set from a concatenated list faster than using `.update`?
Question
While trying to answer What is the preferred way to compose a set from multiple lists in Python, I did some performance analysis and came up with a somewhat surprising conclusion.
Using
python -m timeit -s '
import itertools
import random
n=1000000
random.seed(0)
A = [random.randrange(1<<30) for _ in xrange(n)]
B = [random.randrange(1<<30) for _ in xrange(n)]
C = [random.randrange(1<<30) for _ in xrange(n)]'
for setup, I timed the following snippets:
> $TIMEIT 'set(A+B+C)'
10 loops, best of 3: 872 msec per loop
> $TIMEIT 's = set(A); s.update(B); s.update(C)'
10 loops, best of 3: 930 msec per loop
> $TIMEIT 's = set(itertools.chain(A,B,C))'
10 loops, best of 3: 941 msec per loop
To my surprise, set(A+B+C) is the fastest despite the fact that it creates an intermediate list containing 3000000 elements. .update and itertools.chain are both slower, even though neither of them copies any lists.
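As a quick sanity check (not part of the original question), all three constructions build the same set; only how the elements reach the set differs. The toy lists below are illustrative:

```python
import itertools

A = [1, 2, 3]
B = [3, 4]
C = [5]

s1 = set(A + B + C)                 # concatenate first, then build the set
s2 = set(A)
s2.update(B)                        # fold B and C into an existing set
s2.update(C)
s3 = set(itertools.chain(A, B, C))  # lazy iteration, no intermediate list

assert s1 == s2 == s3 == {1, 2, 3, 4, 5}
```

So the question is purely about speed, not correctness: the concatenating version pays for a 3,000,000-element temporary list, yet still wins.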
What's going on here?
On a second machine (OS X 10.10.5, Python 2.7.10, 2.5GHz Core i7), I ran the following script (which runs the tests forwards and backwards to avoid ordering effects):
SETUP='import itertools
import random
n=1000000
random.seed(0)
A = [random.randrange(1<<30) for _ in xrange(n)]
B = [random.randrange(1<<30) for _ in xrange(n)]
C = [random.randrange(1<<30) for _ in xrange(n)]'
python -m timeit -s "$SETUP" 'set(A+B+C)'
python -m timeit -s "$SETUP" 's = set(A); s.update(B); s.update(C)'
python -m timeit -s "$SETUP" 's = set(itertools.chain(A,B,C))'
python -m timeit -s "$SETUP" 's = set(itertools.chain(A,B,C))'
python -m timeit -s "$SETUP" 's = set(A); s.update(B); s.update(C)'
python -m timeit -s "$SETUP" 'set(A+B+C)'
and got the following results:
10 loops, best of 3: 579 msec per loop
10 loops, best of 3: 726 msec per loop
10 loops, best of 3: 775 msec per loop
10 loops, best of 3: 761 msec per loop
10 loops, best of 3: 737 msec per loop
10 loops, best of 3: 555 msec per loop
Now set(A+B+C) is clearly faster, and the results are quite stable - it is hard to chalk this up to mere measurement error. Running this script repeatedly produces similar results.
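The forwards-and-backwards trick from the shell script can also be done from Python itself via the timeit module. This is a minimal sketch (not from the original post) with a smaller n so it runs quickly; under Python 2, substitute xrange for range in the setup string:

```python
import timeit

setup = """
import itertools
import random
random.seed(0)
n = 10000  # far smaller than the question's 1000000, for a quick run
A = [random.randrange(1 << 30) for _ in range(n)]
B = [random.randrange(1 << 30) for _ in range(n)]
C = [random.randrange(1 << 30) for _ in range(n)]
"""

stmts = [
    "set(A+B+C)",
    "s = set(A); s.update(B); s.update(C)",
    "s = set(itertools.chain(A,B,C))",
]

# Run the statements forwards and then backwards, keeping the best of
# 3 repeats each, so ordering effects do not favour any one statement.
for stmt in stmts + stmts[::-1]:
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=10))
    print("{:<40} {:.4f} s".format(stmt, best))
```

If the first and last lines of the output (the same statement, run first and run last) agree closely, ordering effects are not driving the ranking.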
Answer
I get different, non-surprising results than you on my Win 7 SP1 box with a similar processor running Python 2.7.10, where set(A+B+C) appears to be the slowest way to do it, as one might expect. Similar results were obtained with garbage collection re-enabled and with Python 3.4.3.
I used my own performance-assessing testbed based on timeit and got the following results:
fastest to slowest execution speeds (Python 2.7.10)
(10 executions, best of 3 repetitions)
set(A); s.update(B); s.update(C) : 4.787919 secs, rel speed 1.00x, 0.00% slower
set(A).update(B,C) : 6.463666 secs, rel speed 1.35x, 35.00% slower
set(itertools.chain(A,B,C)) : 6.743028 secs, rel speed 1.41x, 40.83% slower
set(A+B+C) : 8.030483 secs, rel speed 1.68x, 67.72% slower
Benchmark code:
from __future__ import print_function
import sys
from textwrap import dedent
import timeit

N = 10  # Number of executions of each "algorithm"
R = 3   # number of Repetitions of executions

# common setup for all algorithms (not timed)
setup = dedent("""
    import itertools
    import gc
    import random
    try:
        xrange
    except NameError:
        xrange = range
    random.seed(0)
    n = 1000000  # number of elements in each list
    A = [random.randrange(1<<30) for _ in xrange(n)]
    B = [random.randrange(1<<30) for _ in xrange(n)]
    C = [random.randrange(1<<30) for _ in xrange(n)]
    # gc.enable()  # to (re)enable garbage collection if desired
""")

algorithms = {
    "set(A+B+C)": dedent("""
        s = set(A+B+C)
    """),
    "set(A); s.update(B); s.update(C)": dedent("""
        s = set(A); s.update(B); s.update(C)
    """),
    "set(itertools.chain(A,B,C))": dedent("""
        s = set(itertools.chain(A,B,C))
    """),
    "set(A).update(B,C)": dedent("""
        s = set(A).update(B,C)
    """),
}

# execute and time algorithms, collecting results
timings = [
    (label,
     min(timeit.repeat(algorithms[label], setup=setup, repeat=R, number=N)),
    ) for label in algorithms
]

print('fastest to slowest execution speeds (Python {}.{}.{})\n'.format(
          *sys.version_info[:3]),
      ' ({:,d} executions, best of {:d} repetitions)\n'.format(N, R))

longest = max(len(timing[0]) for timing in timings)  # length of longest label
ranked = sorted(timings, key=lambda t: t[1])  # ascending sort by execution time
fastest = ranked[0][1]

for timing in ranked:
    print("{:>{width}} : {:9.6f} secs, rel speed {:4.2f}x, {:6.2f}% slower".
              format(timing[0], timing[1], round(timing[1]/fastest, 2),
                     round((timing[1]/fastest - 1) * 100, 2), width=longest))
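One caveat about the "set(A).update(B,C)" variant in the benchmark above (a side note, not from the original answer): set.update mutates the set in place and returns None, so chaining it off the constructor discards the set. The timed work is the same, but in real code you need two statements:

```python
A, B, C = [1, 2], [3], [4]

s = set(A).update(B, C)  # update() returns None, so s is None here
assert s is None

s = set(A)
s.update(B, C)           # same work, but s remains the resulting set
assert s == {1, 2, 3, 4}
```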