Why is variable1 += variable2 much faster than variable1 = variable1 + variable2?
I have inherited some Python code which is used to create huge tables (of up to 19 columns wide by 5000 rows). It took nine seconds for the table to be drawn on the screen. I noticed that each row was added using this code:
sTable = sTable + '\n' + GetRow()
where sTable
is a string.
I changed that to:
sTable += '\n' + GetRow()
and I noticed that the table now appeared in six seconds.
And then I changed it to:
sTable += '\n%s' % GetRow()
based on these Python performance tips (still six seconds).
Since this was called about 5000 times, it highlighted the performance issue. But why was there such a large difference? And why didn't the compiler spot the problem in the first version and optimise it?
This isn't about in-place += versus binary +. You didn't tell us the whole story. Your original version concatenated 3 strings, not just two:
sTable = sTable + '\n' + sRow # simplified, sRow is a function call
Python tries to help out and optimises string concatenation; both when using strobj += otherstrobj
and strobj = strobj + otherstringobj
, but it cannot apply this optimisation when more than 2 strings are involved.
Python strings are immutable normally, but if there are no other references to the left-hand string object and it is being rebound anyway, then Python cheats and mutates the string. This avoids having to create a new string each time you concatenate, and that can lead to a big speed improvement.
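This cheat can be observed directly: keeping an extra reference to the accumulator defeats the in-place resize and forces a full copy on every iteration. A small sketch (timings are illustrative; the in-place resize is a CPython implementation detail, not a language guarantee):

```python
import timeit

def concat_single_ref(n):
    # Only the local variable and the evaluation stack reference `s`,
    # so CPython may resize the string in place.
    s = ''
    for _ in range(n):
        s = s + 'x'
    return s

def concat_extra_ref(n):
    # `alias` keeps a second reference alive across the concatenation,
    # so a brand-new string must be built each time.
    s = ''
    for _ in range(n):
        alias = s
        s = s + 'x'
    return s

print(timeit.timeit(lambda: concat_single_ref(10000), number=20))
print(timeit.timeit(lambda: concat_extra_ref(10000), number=20))
```

On CPython the second timing is typically much larger; both functions of course produce the same string.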
This is implemented in the bytecode evaluation loop. Both when using BINARY_ADD
on two strings and when using INPLACE_ADD
on two strings, Python delegates concatenation to a special helper function string_concatenate()
. To be able to optimize the concatenation by mutating the string, it first needs to make sure that the string has no other references to it; if only the stack and the original variable reference it then this can be done, and the next operation is going to replace the original variable reference.
So if there are just 2 references to the string, and the next operator is one of STORE_FAST
(set a local variable), STORE_DEREF
(set a variable referenced by closed over functions) or STORE_NAME
(set a global variable), and the affected variable currently references the same string, then that target variable is cleared, reducing the reference count to just 1: the stack.
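For contrast, the plain two-string form compiles to an add immediately followed by the store, which is exactly the pattern the optimisation looks for. A quick check with dis (the opcode names shown in this answer are from Python 2.x/early 3.x; Python 3.11+ prints BINARY_OP instead of BINARY_ADD):

```python
import dis

# Two-string concatenation: the add is immediately followed by STORE_NAME,
# so the in-place optimisation can apply.
dis.dis(compile("sTable = sTable + sRow", "<stdin>", "exec"))
```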
And this is why your original code could not use this optimization fully. The first part of your expression is sTable + '\n'
and the next operation is another BINARY_ADD
:
>>> import dis
>>> dis.dis(compile(r"sTable = sTable + '\n' + sRow", '<stdin>', 'exec'))
1 0 LOAD_NAME 0 (sTable)
3 LOAD_CONST 0 ('\n')
6 BINARY_ADD
7 LOAD_NAME 1 (sRow)
10 BINARY_ADD
11 STORE_NAME 0 (sTable)
14 LOAD_CONST 1 (None)
17 RETURN_VALUE
The first BINARY_ADD
is followed by a LOAD_NAME
to access the sRow
variable, not a store operation. This first BINARY_ADD must always result in a new string object, which grows ever larger as sTable grows, so creating it takes more and more time.
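A rough model makes the cost concrete: if every concatenation allocates a fresh string, the total characters copied grow quadratically with the number of rows, whereas growing in place writes each character roughly once (this ignores occasional reallocation copies, so it is an approximation, not CPython's exact accounting):

```python
def chars_copied_fresh(row_len, rows):
    # Each iteration allocates a new string and copies everything so far.
    total, size = 0, 0
    for _ in range(rows):
        size += 1 + row_len      # '\n' plus the new row
        total += size            # the whole accumulated string is copied
    return total

def chars_copied_inplace(row_len, rows):
    # Only the appended part is written each time.
    return rows * (1 + row_len)

# 5000 rows of ~19 characters, as in the question:
print(chars_copied_fresh(19, 5000))    # ~250 million characters copied
print(chars_copied_inplace(19, 5000))  # 100,000 characters copied
```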
You changed this code to:
sTable += '\n%s' % sRow
which removed the second concatenation. Now the bytecode is:
>>> dis.dis(compile(r"sTable += '\n%s' % sRow", '<stdin>', 'exec'))
1 0 LOAD_NAME 0 (sTable)
3 LOAD_CONST 0 ('\n%s')
6 LOAD_NAME 1 (sRow)
9 BINARY_MODULO
10 INPLACE_ADD
11 STORE_NAME 0 (sTable)
14 LOAD_CONST 1 (None)
17 RETURN_VALUE
and all we have left is an INPLACE_ADD
followed by a store. Now sTable
can be altered in place, without producing an ever larger new string object.
You'd have gotten the same speed difference with:
sTable = sTable + ('\n%s' % sRow)
here.
A time trial shows the difference:
>>> import random
>>> from timeit import timeit
>>> testlist = [''.join([chr(random.randint(48, 127)) for _ in range(random.randrange(10, 30))]) for _ in range(1000)]
>>> def str_threevalue_concat(lst):
... res = ''
... for elem in lst:
... res = res + '\n' + elem
...
>>> def str_twovalue_concat(lst):
... res = ''
... for elem in lst:
... res = res + ('\n%s' % elem)
...
>>> timeit('f(l)', 'from __main__ import testlist as l, str_threevalue_concat as f', number=10000)
6.196403980255127
>>> timeit('f(l)', 'from __main__ import testlist as l, str_twovalue_concat as f', number=10000)
2.3599119186401367
The moral of this story is that you should not be using string concatenation in the first place. The proper way to build a new string from loads of other strings is to use a list, then use str.join()
:
table_rows = []
for something in something_else:
table_rows += ['\n', GetRow()]
sTable = ''.join(table_rows)
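A quick sanity check that the join-based rewrite produces exactly the same string as the original concatenation loop (using a list of stand-in rows in place of the GetRow() calls):

```python
rows = ["row %d" % i for i in range(5)]  # stand-in for successive GetRow() results

# Original concatenation approach
s_concat = ""
for r in rows:
    s_concat = s_concat + "\n" + r

# List-building + join approach
parts = []
for r in rows:
    parts += ["\n", r]
s_join = "".join(parts)

assert s_concat == s_join
```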
This is faster still:
>>> def str_join_concat(lst):
... res = ''.join(['\n%s' % elem for elem in lst])
...
>>> timeit('f(l)', 'from __main__ import testlist as l, str_join_concat as f', number=10000)
1.7978830337524414
but you cannot beat using just '\n'.join(lst):
>>> def nl_join_concat(lst):
...     res = '\n'.join(lst)
...
>>> timeit('f(l)', 'from __main__ import testlist as l, nl_join_concat as f', number=10000)
0.23735499382019043