Is extending a Python list (e.g. l += [1]) guaranteed to be thread-safe?

Question

If I have an integer i, it is not safe to do i += 1 on multiple threads:

>>> import threading
>>> i = 0
>>> def increment_i():
...     global i
...     for j in range(1000): i += 1
...
>>> threads = [threading.Thread(target=increment_i) for j in range(10)]
>>> for thread in threads: thread.start()
...
>>> for thread in threads: thread.join()
...
>>> i
4858  # Not 10000

However, if I have a list l, it does seem safe to do l += [1] on multiple threads:

>>> l = []
>>> def extend_l():
...     global l
...     for j in range(1000): l += [1]
...
>>> threads = [threading.Thread(target=extend_l) for j in range(10)]
>>> for thread in threads: thread.start()
...
>>> for thread in threads: thread.join()
...
>>> len(l)
10000

Is l += [1] guaranteed to be thread-safe? If so, does this apply to all Python implementations or just CPython?

It seems that l += [1] is thread-safe but l = l + [1] is not:

>>> l = []
>>> def extend_l():
...     global l
...     for j in range(1000): l = l + [1]
...
>>> threads = [threading.Thread(target=extend_l) for j in range(10)]
>>> for thread in threads: thread.start()
...
>>> for thread in threads: thread.join()
...
>>> len(l)
3305  # Not 10000

Recommended answer

There isn't a happy ;-) answer to this. Nothing about any of it is guaranteed, which you can confirm simply by noting that the Python reference manual makes no guarantees about atomicity.

In CPython it's a matter of pragmatics. As a snipped part of effbot's article says,

"In theory, this means an exact accounting requires an exact understanding of the PVM [Python Virtual Machine] bytecode implementation."

And that's the truth. A CPython expert knows L += [x] is atomic because they know all of the following:

  • += compiles to an INPLACE_ADD bytecode (in CPython 3.11 and later, a BINARY_OP bytecode with a += argument).
  • The implementation of INPLACE_ADD for list objects is written entirely in C (no Python code is on the execution path, so there is no bytecode boundary inside it at which the GIL could be released).
  • In listobject.c, the implementation of INPLACE_ADD is the function list_inplace_concat(), and nothing during its execution needs to execute any user Python code either (if it did, the GIL could again be released).
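The first point can be checked with the standard dis module. This is a minimal sketch, not part of the original answer: the function name extend is made up for illustration, and the opcode name depends on the CPython version, so the code accepts either spelling.

```python
import dis

def extend(l):
    # Hypothetical helper: the statement under inspection.
    l += [1]

# Collect the opcode names CPython generated for the function body.
opnames = [ins.opname for ins in dis.get_instructions(extend)]

# The in-place add is a single bytecode: INPLACE_ADD on CPython 3.10 and
# earlier, BINARY_OP (with a += argument) on CPython 3.11 and later.
inplace_ops = [name for name in opnames if name in ("INPLACE_ADD", "BINARY_OP")]
print(inplace_ops)
```

Seeing one bytecode is only the first step. The other two points (the C implementation and the absence of user Python code on its path) still have to be known from the CPython source.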

That may all sound incredibly difficult to keep straight, but for someone with effbot's knowledge of CPython's internals (at the time he wrote that article), it really isn't. In fact, given that depth of knowledge, it's all kind of obvious ;-)

So as a matter of pragmatics, CPython experts have always freely relied on the principle that "operations that 'look atomic' should really be atomic", and that principle also guided some language decisions. For example, an operation missing from effbot's list (added to the language after he wrote that article):

x = D.pop(y) # or ...
x = D.pop(y, default)

One argument (at the time) in favor of adding dict.pop() was precisely that the obvious C implementation would be atomic, whereas the alternative in use at the time:

x = D[y]
del D[y]

was not atomic (the retrieval and the deletion are done via distinct bytecodes, so threads can switch between them).
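As a rough illustration of why an atomic pop matters, the sketch below has several threads compete to claim entries from one shared dict. It is an assumption-laden example, not from the original answer: the names jobs, claimed, worker and _MISSING are invented here, and the exactly-once behavior relies on the undocumented CPython detail discussed above.

```python
import threading

# Hypothetical shared work table: key -> payload.
jobs = {n: n * n for n in range(1000)}
claimed = [[] for _ in range(4)]   # one private result list per thread
_MISSING = object()                # sentinel meaning "someone else took it"

def worker(slot):
    for key in range(1000):
        # Retrieval and deletion happen in one dict.pop() call, so in
        # CPython at most one thread gets each key. With x = D[y]; del D[y]
        # two threads could both read the value before either deletes it.
        value = jobs.pop(key, _MISSING)
        if value is not _MISSING:
            claimed[slot].append(key)

threads = [threading.Thread(target=worker, args=(slot,)) for slot in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_claimed = sorted(k for bucket in claimed for k in bucket)
print(len(all_claimed), len(jobs))
```

Each thread appends only to its own list, so the only shared mutation is the pop itself.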

But the docs never said .pop() was atomic, and never will. This is a "consenting adults" kind of thing: if you're expert enough to exploit this knowingly, you don't need hand-holding. If you're not expert enough, then the last sentence of effbot's article applies:

"When in doubt, use a mutex!"
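For the integer counter from the question, following that advice could look like the sketch below. It uses threading.Lock from the standard library; the names counter, counter_lock and increment are made up for this example.

```python
import threading

counter = 0
counter_lock = threading.Lock()   # the mutex guarding counter

def increment(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may run the read-modify-write below,
        # so no update can be lost regardless of how += is implemented.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(1000,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 10000
```

Unlike the l += [1] trick, this does not depend on bytecode details or on the GIL, so it should behave the same on any Python implementation that provides threading.Lock.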

As a matter of pragmatic necessity, core developers will never break the atomicity of effbot's examples (or of D.pop() or D.setdefault()) in CPython. Other implementations are under no obligation at all to mimic these pragmatic choices, though. Indeed, since atomicity in these cases relies on CPython's specific form of bytecode combined with CPython's use of a global interpreter lock that can only be released between bytecodes, it could be a real pain for other implementations to mimic them.

And you never know: some future version of CPython may remove the GIL too! I doubt it, but it's theoretically possible. If that happens, I bet a parallel version retaining the GIL will be maintained as well, because a whole lot of code (especially extension modules written in C) relies on the GIL for thread safety.

Worth repeating:

When in doubt, use a mutex!
