为什么字符串连接在 Python 中很重要? [英] Why does string concatenation matter in Python?

查看:47
本文介绍了为什么字符串连接在 Python 中很重要?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我看过很多帖子(例如这里此处) 谈论 Python 中的连接,如何最好地做到这一点('+' vs ','),这更快,等等.但我似乎无法找出为什么这很重要.Roger Pate 在第一个例子中提到了传递多个参数 vs 一个参数,但我仍然不清楚.

那么,为什么串联很重要?什么是至关重要的用例?

解决方案

字符串在 Python 中是不可变对象,因此您不能修改现有字符串.这意味着字符串的每次连接都会导致创建一个新的字符串对象,并丢弃两个(源对象).内存分配非常昂贵,足以让这件事变得很重要.

因此,当您知道需要连接多个字符串时,请将它们存储在列表中.然后最后,只需一次,使用 ''.join(list_of_strings) 加入该列表.这样,一个新的字符串只会被创建一次.

请注意,这也适用于其他语言.例如,Java 和 C# 都有一个 StringBuilder 类型,它们本质上是相同的.只要您不断添加新的字符串部分,它们就会在内部将其添加到字符串中,并且只有当您将构建器转换为真正的字符串时,连接才会发生——而且只发生一次.

另请注意,当您在一行中添加几个字符串时,这种内存分配开销已经发生了.例如 a + b + c + d 将创建三个中间字符串.如果您查看该表达式的字节码,您可以看到:

<预><代码>>>>dis.dis('a + b + c + d')1 0 LOAD_NAME 0 (a)3 LOAD_NAME 1 (b)6 BINARY_ADD7 LOAD_NAME 2 (c)10 BINARY_ADD11 LOAD_NAME 3 (d)14 BINARY_ADD15 RETURN_VALUE

每个 BINARY_ADD 连接前两个值并为堆栈上的结果创建一个新对象.请注意,对于常量字符串文字,编译器足够聪明,可以注意到您正在添加常量:

<预><代码>>>>dis.dis('"foo" + "bar" + "baz"')1 0 LOAD_CONST 4 ('foobarbaz')3 RETURN_VALUE

如果你确实有一些可变部分——例如,如果你想产生一个格式很好的输出——那么你又回到创建中间字符串对象了.在这种情况下,使用 str.format 是个好主意,例如'foo{} bar {} baz'.format(a, b).

I've seen a good number of posts (examples here and here) speaking about concatenation in Python, how best to do it ('+' vs ','), which is faster, etc. But I can't seem to find out why it matters. Roger Pate mentioned in the first example about passing multiple arguments vs one, but I'm still unclear.

So, why does concatenation matter? What is a use case where this would be critical?

解决方案

Strings are immutable objects in Python, so you cannot modify existing strings. That means that every concatenation of a string results in a new string object being created and two (the source objects) being thrown away. Memory allocation is expensive enough to make this matter a lot.

So when you know you need to concatenate multiple strings, store them in a list. And then at the end, just once, join that list using ''.join(list_of_strings). That way, a new string will only be created once.

Note that this also applies in other languages. For example Java and C# both have a StringBuilder type which is essentially the same. As long as you keep appending new string parts, they will just internally add that to a string, and only when you convert the builder into a real string, the concatenation happens—and again just once.

Also note, that this memory allocation overhead already happens when you just append a few strings in a single line. For example a + b + c + d will create three intermediary strings. You can see that, if you look at the byte code for that expression:

>>> dis.dis('a + b + c + d')
  1           0 LOAD_NAME                0 (a) 
              3 LOAD_NAME                1 (b) 
              6 BINARY_ADD           
              7 LOAD_NAME                2 (c) 
             10 BINARY_ADD           
             11 LOAD_NAME                3 (d) 
             14 BINARY_ADD           
             15 RETURN_VALUE         

Each BINARY_ADD concats the two previous values and creates a new object for the result on the stack. Note that for constant string literals, the compiler is smart enough to notice that you are adding constants:

>>> dis.dis('"foo" + "bar" + "baz"')
  1           0 LOAD_CONST               4 ('foobarbaz') 
              3 RETURN_VALUE         

If you do have some variable parts within that though—for example if you want to produce a nicely formatted output—then you are back to creating intermediary string objects. In that case, using str.format is a good idea, e.g. 'foo {} bar {} baz'.format(a, b).

这篇关于为什么字符串连接在 Python 中很重要?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆