Joining strings. Generator or list comprehension?


Question

Consider the problem of extracting the alphabetic characters from a huge string.

One way to do it is

''.join([c for c in hugestring if c.isalpha()])

The mechanism is clear: the list comprehension generates a list of characters, and the join method knows how many characters it needs to join by checking the length of the list.

Another way is

''.join(c for c in hugestring if c.isalpha())

Here the generator expression produces a generator. The join method does not know how many characters it is going to join, because a generator has no length (it does not support len). So this way of joining should be slower than the list comprehension method.

But testing in Python shows that it is not slower. Why is this so? Can anyone explain how join works on a generator?
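The question does not show how the two versions were timed. A minimal sketch of such a comparison, assuming made-up test data and the hypothetical helper names join_list and join_gen, could look like this. The absolute numbers depend on the machine and Python version, so none are quoted here.

```python
import timeit

# Hypothetical test data; the question does not say what hugestring contains.
hugestring = "abc123 def456 " * 10_000


def join_list(s):
    # Version 1: build the list first, then hand it to join.
    return ''.join([c for c in s if c.isalpha()])


def join_gen(s):
    # Version 2: hand join a generator expression directly.
    return ''.join(c for c in s if c.isalpha())


# Both versions must produce the same string.
assert join_list(hugestring) == join_gen(hugestring)

# Time each version over the same number of runs.
t_list = timeit.timeit(lambda: join_list(hugestring), number=20)
t_gen = timeit.timeit(lambda: join_gen(hugestring), number=20)
print(f"list comprehension:   {t_list:.3f} s")
print(f"generator expression: {t_gen:.3f} s")
```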

To be clear:

sum(j for j in range(100))

does not need to know about the 100, because it can keep track of a running sum. It fetches the next element by calling next on the generator and adds it to the running sum. Strings, however, are immutable, so joining them cumulatively would create a new string on every iteration, which would take a lot of time.
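The contrast can be sketched as follows. The function names running_sum and cumulative_join are hypothetical, chosen only for illustration. Conceptually, each concatenation in cumulative_join builds a new string object; CPython can sometimes avoid the copy as an implementation detail, but that is not guaranteed by the language.

```python
def running_sum(numbers):
    # Needs no length up front: one accumulator, updated per element.
    total = 0
    for n in numbers:
        total += n
    return total


def cumulative_join(chars):
    # Naive accumulation: str is immutable, so each step conceptually
    # creates a new string holding everything joined so far.
    result = ''
    for c in chars:
        result = result + c
    return result


print(running_sum(j for j in range(100)))
print(cumulative_join(c for c in "a1b2c3" if c.isalpha()))
```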

Recommended answer

When you call str.join(gen) where gen is a generator, Python does the equivalent of list(gen) before going on to examine the length of the resulting sequence.

Specifically, if you look at the code implementing str.join in CPython, you'll see this call:

    fseq = PySequence_Fast(seq, "can only join an iterable");

The call to PySequence_Fast converts the seq argument into a list if it isn't already a list or tuple.

So the two versions of your call are handled almost identically. With the list comprehension, you build the list yourself and pass it to join. With the generator expression, the generator object you pass in is turned into a list right at the start of join, and the rest of the code operates the same way for both versions.
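As a rough illustration, the behavior described above can be modeled in pure Python. This is only a sketch and not CPython's actual code: join_sketch is a hypothetical name, the list buf stands in for the single result buffer that the C implementation allocates, and the final ''.join(buf) is there only to turn the model's buffer back into a str.

```python
def join_sketch(sep, iterable):
    # Step 1 (what PySequence_Fast does): keep lists and tuples as they
    # are, and materialize anything else, such as a generator, into a list.
    if isinstance(iterable, (list, tuple)):
        items = iterable
    else:
        items = list(iterable)

    # Step 2: with a real sequence in hand, the total size is known.
    total = sum(len(s) for s in items) + len(sep) * max(len(items) - 1, 0)

    # Step 3: fill a buffer of exactly that size in a single pass.
    buf = [''] * total
    pos = 0
    for i, s in enumerate(items):
        if i:
            buf[pos:pos + len(sep)] = sep
            pos += len(sep)
        buf[pos:pos + len(s)] = s
        pos += len(s)
    return ''.join(buf)


hugestring = "a1b2c3"  # small stand-in for the question's hugestring
print(join_sketch('', (c for c in hugestring if c.isalpha())))
print(join_sketch('', [c for c in hugestring if c.isalpha()]))
```

In this model, the generator version differs from the list version only in step 1, which is why the two calls take about the same time.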
