Python string concatenation internal details


Problem description



Assume that we have a list of strings and we want to create a single string by concatenating all the elements of this list, like so:

def foo(str_lst):
    result = ''
    for element in str_lst:
        result += element
    return result

Since strings are immutable objects, I expect Python to create a new str object and copy the contents of result and element at each iteration. That would give O(M * N^2) time complexity, where M is the length of each element and N is the size of the list.

However, my experiment shows that it runs in linear time.

N = 1000000 # 1 million
str_lst = ['a' for _ in range(N)]

foo(str_lst) # It takes around 0.5 seconds

N = 2000000 # 2 million
str_lst = ['a' for _ in range(N)]

foo(str_lst) # It takes around 1.0 seconds

N = 10000000 # 10 million
str_lst = ['a' for _ in range(N)]

foo(str_lst) # It takes around 5.3 seconds

I suspect that Python uses something like a string buffer under the hood, so it doesn't create a new object at each iteration.
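One way to probe this suspicion is to time the function for growing inputs and check that the runtime scales roughly linearly. This is only a rough sketch: absolute numbers depend on the machine, and the behaviour under test is a CPython implementation detail, not a language guarantee.

```python
import timeit

def foo(str_lst):
    result = ''
    for element in str_lst:
        result += element
    return result

# doubling N should roughly double the runtime if foo is O(N)
for n in (100_000, 200_000, 400_000):
    lst = ['a'] * n
    t = timeit.timeit(lambda: foo(lst), number=5)
    print(f"N={n}: {t:.3f}s")
```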

Now consider a slightly different implementation. The only difference is one extra assignment.

def foo2(str_lst):
    result = ''
    for element in str_lst:
        result += element
        temp = result  # newly added line
    return result

I know that the temp = result line does not create a new object; temp just points to the same object. So this little change shouldn't affect performance much.
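That an assignment binds a name rather than copying the object can be checked directly (a minimal sketch):

```python
result = "hello world"
temp = result            # binds a second name; no copy is made
assert temp is result    # both names refer to the very same object
assert id(temp) == id(result)
```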

N = 1000000 # 1 million
str_lst = ['a' for _ in range(N)]
foo(str_lst) # It takes around 0.5 seconds
foo2(str_lst) # It takes around 30 seconds

N = 2000000 # 2 million
str_lst = ['a' for _ in range(N)]
foo(str_lst) # It takes around 1 second
foo2(str_lst) # It takes around 129 seconds

However, there is a huge difference: it looks like the foo2 function is O(N^2) while foo is O(N).

My question is: how does Python achieve linear time in string concatenation without breaking other language guarantees, such as string immutability? And how does that extra line affect performance that much? I searched a bit in the CPython implementation but couldn't find the exact location.

Update

Here is the line profiling results.

result for foo function

Total time: 0.545577 s
File: <ipython-input-38-b9bb169e8fe0>
Function: foo at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
 1                                           def foo(str_lst):
 2         1          2.0      2.0      0.0      result = ''
 3   1000001     238820.0      0.2     43.8      for element in str_lst:
 4   1000000     306755.0      0.3     56.2          result += element
 5         1          0.0      0.0      0.0      return result

result for foo2 function

Total time: 30.6663 s
File: <ipython-input-40-34dd53670dd9>
Function: foo2 at line 1
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
 1                                           def foo2(str_lst):
 2         1          2.0      2.0      0.0      result = ''
 3   1000001     299122.0      0.3      1.0      for element in str_lst:
 4   1000000   30033127.0     30.0     97.7          result += element
 5   1000000     413283.0      0.4      1.3          temp = result
 6         1          0.0      0.0      0.0      return result

Somehow the temp = result line affects the performance of the result += element line.

Solution

Having another name pointing to the same object kills the optimisation. The optimisation works by resizing the string object and appending in place. If there is more than one reference to the object, it cannot be resized without affecting the other references; since strings are immutable, allowing that would be a serious flaw in the implementation.

temp = result

increases the reference count of the string object bound to result, thereby prohibiting the optimisation.

The full list of checks performed in the case of += (which eventually translates to PyUnicode_Append, see https://github.com/python/cpython/blob/8a1bab92915dd5c88832706c56af2f5611181d50/Objects/unicodeobject.c#L11318) can be seen in the unicode_modifiable function. Among other things, it checks that the reference count of the object is equal to one, that the string isn't interned, and that it isn't a string subclass.

There are a couple more checks in the if statement guarding this optimisation, if you want a more thorough list.


Though not the basic issue of your question, future readers might be curious about how to perform string concatenation efficiently. Besides similar questions on Stack Overflow, the Python FAQ also has an entry on this.
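The usual recommendation is to collect the pieces and join them once, which stays linear regardless of how many names alias the intermediate results. A sketch of the two common idioms (the function names here are just illustrative):

```python
import io

def foo_join(str_lst):
    # str.join computes the total length once and copies each piece once,
    # so it is O(total length) on any Python implementation
    return ''.join(str_lst)

def foo_buffer(str_lst):
    # an alternative when pieces arrive incrementally rather than as a list
    buf = io.StringIO()
    for element in str_lst:
        buf.write(element)
    return buf.getvalue()
```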
