Why is variable1 += variable2 much faster than variable1 = variable1 + variable2?


Problem description


I have inherited some Python code which is used to create huge tables (of up to 19 columns wide by 5000 rows). It took nine seconds for the table to be drawn on the screen. I noticed that each row was added using this code:

sTable = sTable + '\n' + GetRow()

where sTable is a string.

I changed that to:

sTable += '\n' + GetRow()

and I noticed that the table now appeared in six seconds.

And then I changed it to:

sTable += '\n%s' % GetRow()

based on these Python performance tips (still six seconds).

Since this was called about 5000 times, it highlighted the performance issue. But why was there such a large difference? And why didn't the compiler spot the problem in the first version and optimise it?

Solution

This isn't about in-place += versus binary +. You didn't tell us the whole story. Your original version concatenated 3 strings, not just two:

sTable = sTable + '\n' + sRow  # simplified, sRow is a function call

Python tries to help out and optimises string concatenation; both when using strobj += otherstrobj and strobj = strobj + otherstringobj, but it cannot apply this optimisation when more than 2 strings are involved.

Python strings are immutable normally, but if there are no other references to the left-hand string object and it is being rebound anyway, then Python cheats and mutates the string. This avoids having to create a new string each time you concatenate, and that can lead to a big speed improvement.

This is implemented in the bytecode evaluation loop. Both when using BINARY_ADD on two strings and when using INPLACE_ADD on two strings, Python delegates concatenation to a special helper function string_concatenate(). To be able to optimize the concatenation by mutating the string, it first needs to make sure that the string has no other references to it; if only the stack and the original variable reference it then this can be done, and the next operation is going to replace the original variable reference.

So if there are just 2 references to the string, and the next operation is one of STORE_FAST (set a local variable), STORE_DEREF (set a variable shared with a closure) or STORE_NAME (set a name in the current namespace, such as a module-level global), and the affected variable currently references the same string, then that target variable is cleared to reduce the number of references to just 1: the one on the stack.
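The reference check matters because a string must still look immutable to the rest of the program. Here is a minimal sketch of the observable behaviour, with illustrative variable names that are not from the question. Whether the in-place path is actually taken is a CPython implementation detail that this snippet does not detect:

```python
# A second name bound to the same string object: the interpreter must not
# resize that object in place, or the change would leak into `alias`.
table = "row1"
alias = table          # extra reference to the same object
table += "\nrow2"      # rebinding `table` must leave `alias` alone

print(repr(alias))     # the old value is still intact
print(repr(table))     # the new, longer string
```

With the extra reference alive, a new string object has to be created for the result, which is exactly the case the optimisation is designed to avoid.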

And this is why your original code could not use this optimization fully. The first part of your expression is sTable + '\n' and the next operation is another BINARY_ADD:

>>> import dis
>>> dis.dis(compile(r"sTable = sTable + '\n' + sRow", '<stdin>', 'exec'))
  1           0 LOAD_NAME                0 (sTable)
              3 LOAD_CONST               0 ('\n')
              6 BINARY_ADD          
              7 LOAD_NAME                1 (sRow)
             10 BINARY_ADD          
             11 STORE_NAME               0 (sTable)
             14 LOAD_CONST               1 (None)
             17 RETURN_VALUE        

The first BINARY_ADD is followed by a LOAD_NAME to access the sRow variable, not a store operation. This first BINARY_ADD must therefore always produce a new string object. That object gets ever larger as sTable grows, so creating it takes more and more time.

You changed this code to:

sTable += '\n%s' % sRow

which removed the second concatenation. Now the bytecode is:

>>> dis.dis(compile(r"sTable += '\n%s' % sRow", '<stdin>', 'exec'))
  1           0 LOAD_NAME                0 (sTable)
              3 LOAD_CONST               0 ('\n%s')
              6 LOAD_NAME                1 (sRow)
              9 BINARY_MODULO       
             10 INPLACE_ADD         
             11 STORE_NAME               0 (sTable)
             14 LOAD_CONST               1 (None)
             17 RETURN_VALUE        

and all we have left is an INPLACE_ADD followed by a store. Now sTable can be altered in place, without creating an ever larger new string object.

You'd have gotten the same speed difference here with:

sTable = sTable + ('\n%s' % sRow)
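The "is the add immediately followed by a store?" condition described above can be checked programmatically. The following is a rough sketch, not part of the original answer: the helper name adds_followed_by_store is made up for illustration, and it assumes that newer CPython releases (3.11 and later) report additions as a generic BINARY_OP instruction instead of BINARY_ADD / INPLACE_ADD:

```python
import dis

def adds_followed_by_store(source):
    """For each add instruction in `source`, report whether the very next
    instruction is a STORE_* (the precondition for the in-place trick)."""
    instructions = list(dis.get_instructions(compile(source, "<example>", "exec")))
    results = []
    for current, following in zip(instructions, instructions[1:]):
        # Older CPython: BINARY_ADD / INPLACE_ADD.
        # Newer CPython: BINARY_OP whose argrepr is "+" or "+=".
        is_add = current.opname in ("BINARY_ADD", "INPLACE_ADD") or (
            current.opname == "BINARY_OP" and current.argrepr in ("+", "+=")
        )
        if is_add:
            results.append(following.opname.startswith("STORE_"))
    return results

for statement in (
    r"sTable = sTable + '\n' + sRow",
    r"sTable += '\n%s' % sRow",
    r"sTable = sTable + ('\n%s' % sRow)",
):
    print(statement, adds_followed_by_store(statement))
```

Only the three-string statement contains an add whose result is consumed by another add rather than by a store. This inspects the bytecode shape only; it says nothing about whether a particular interpreter version actually applies the optimisation.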

A time trial shows the difference:

>>> import random
>>> from timeit import timeit
>>> testlist = [''.join([chr(random.randint(48, 127)) for _ in range(random.randrange(10, 30))]) for _ in range(1000)]
>>> def str_threevalue_concat(lst):
...     res = ''
...     for elem in lst:
...         res = res + '\n' + elem
... 
>>> def str_twovalue_concat(lst):
...     res = ''
...     for elem in lst:
...         res = res + ('\n%s' % elem)
... 
>>> timeit('f(l)', 'from __main__ import testlist as l, str_threevalue_concat as f', number=10000)
6.196403980255127
>>> timeit('f(l)', 'from __main__ import testlist as l, str_twovalue_concat as f', number=10000)
2.3599119186401367
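The functions in the session above discard their result, so the timing alone does not show that both loops build the same string. Below is a small script-style variant under stated assumptions: the function names, the sample data and the repetition count are illustrative choices, and the absolute numbers will differ by machine and interpreter version:

```python
from timeit import timeit

def three_value_concat(rows):
    res = ''
    for elem in rows:
        res = res + '\n' + elem      # first + feeds a second +, not a store
    return res

def two_value_concat(rows):
    res = ''
    for elem in rows:
        res = res + ('\n%s' % elem)  # single + immediately followed by a store
    return res

# Hypothetical sample data standing in for the real table rows.
sample_rows = ['row %d' % i for i in range(1000)]

# Both loops must build the same string; only the speed may differ.
assert three_value_concat(sample_rows) == two_value_concat(sample_rows)

for func in (three_value_concat, two_value_concat):
    seconds = timeit(lambda: func(sample_rows), number=200)
    print('%s: %.3f s' % (func.__name__, seconds))
```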

The moral of this story is that you should not be using string concatenation in the first place. The proper way to build a new string from loads of other strings is to use a list, then use str.join():

table_rows = []
for something in something_else:
    table_rows += ['\n', GetRow()]
sTable = ''.join(table_rows)
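As a self-contained sketch of the same pattern: get_row below is a hypothetical stand-in for the question's GetRow(), which is not shown in the original, and build_table is an illustrative name:

```python
def get_row(index):
    # Hypothetical stand-in for the question's GetRow().
    return 'cell %d' % index

def build_table(row_count):
    table_rows = []
    for index in range(row_count):
        # Collect the pieces; the growing table is never copied here.
        table_rows += ['\n', get_row(index)]
    # One pass at the end builds the final string.
    return ''.join(table_rows)

print(repr(build_table(3)))
```

This keeps the leading newline before every row, matching the concatenation loops. A plain '\n'.join(rows) would place the separator only between rows.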

This is faster still:

>>> def str_join_concat(lst):
...     res = ''.join(['\n%s' % elem for elem in lst])
... 
>>> timeit('f(l)', 'from __main__ import testlist as l, str_join_concat as f', number=10000)
1.7978830337524414

but you cannot beat using just '\n'.join(lst) (wrapped here in a function named nl_join_concat, analogous to str_join_concat above):

>>> timeit('f(l)', 'from __main__ import testlist as l, nl_join_concat as f', number=10000)
0.23735499382019043
