Does the new implementation of the GIL in Python handle the race condition issue?

Question

I've read an article about multithreading in Python in which the authors use synchronization to solve a race condition. I ran the example code below to reproduce the race condition:

import threading 

# global variable x 
x = 0

def increment(): 
    """ 
    function to increment global variable x 
    """
    global x 
    x += 1

def thread_task(): 
    """ 
    task for thread 
    calls increment function 100000 times. 
    """
    for _ in range(100000): 
        increment() 

def main_task(): 
    global x 
    # setting global variable x as 0 
    x = 0

    # creating threads 
    t1 = threading.Thread(target=thread_task) 
    t2 = threading.Thread(target=thread_task) 

    # start threads 
    t1.start() 
    t2.start() 

    # wait until threads finish their job 
    t1.join() 
    t2.join() 

if __name__ == "__main__": 
    for i in range(10): 
        main_task() 
        print("Iteration {0}: x = {1}".format(i,x)) 

With Python 2.7.15 it returns the same result as the article. With Python 3.6.9 it does not: every iteration prints the same result, x = 200000.

I wonder whether the new implementation of the GIL (since Python 3.2) handles the race condition issue. If it does, why do Lock and Mutex still exist in Python > 3.2? If it doesn't, why is there no conflict when multiple threads modify a shared resource, as in the example above?

I have been struggling with these questions lately while trying to understand more about how Python really works under the hood.

Answer

The change you are referring to replaced the check interval with the switch interval. Rather than considering a thread switch every 100 bytecodes, the interpreter now does so every 5 milliseconds.

Ref: https://pymotw.com/3/sys/threads.html https://mail.python.org/pipermail/python-dev/2009-October/093321.html
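
As a rough illustration of the switch interval on Python 3, the sketch below reads the current value and changes it temporarily. It uses only sys.getswitchinterval() and sys.setswitchinterval() from the standard library; the 0.001 value and the variable names are arbitrary choices for this example, not part of the original answer.

```python
import sys

# Read the current switch interval, in seconds.
# It is 0.005 (5 ms) by default, unless something changed it earlier.
original = sys.getswitchinterval()
print("switch interval:", original)

# Ask the interpreter to consider a thread switch more often.
# 0.001 is an arbitrary value picked for this sketch.
sys.setswitchinterval(0.001)
changed = sys.getswitchinterval()
print("temporary switch interval:", changed)

# Restore the previous value so the rest of the program is unaffected.
sys.setswitchinterval(original)
```

The interval is a request to the interpreter rather than a guarantee about when a switch happens, so it only changes how likely an interleaving is.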

So if your code runs fast enough, it may never experience a thread switch in the middle of an update, and the operations can appear to be atomic when in fact they are not. The race condition did not show up because the threads never actually interleaved. x += 1 is actually four bytecodes (the last two instructions below are the function's implicit return None):

>>> dis.dis(sync.increment)
 11           0 LOAD_GLOBAL              0 (x)
              3 LOAD_CONST               1 (1)
              6 INPLACE_ADD         
              7 STORE_GLOBAL             0 (x)
             10 LOAD_CONST               2 (None)
             13 RETURN_VALUE        

A thread switch in the interpreter can occur between any two bytecodes.
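
To check this on your own interpreter, a minimal sketch along the following lines lists the instructions behind x += 1. The listing above comes from a module named sync; here increment is simply redefined inline so the snippet is self-contained. Opcode names and offsets differ between Python versions (newer releases use a different instruction in place of INPLACE_ADD, for example), so the output will not necessarily match the listing above.

```python
import dis

x = 0

def increment():
    """Increment the global variable x."""
    global x
    x += 1

# Collect the opcode names in execution order.
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)

# The read of x and the write back to x are separate instructions,
# which is why the update is not atomic.
load_at = opnames.index("LOAD_GLOBAL")
store_at = opnames.index("STORE_GLOBAL")
print("load at position", load_at, "store at position", store_at)
```

Whatever the exact opcodes, the point is that the load of x and the store to x are distinct steps with other work in between.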

Consider the variant below: on 2.7 it always prints 200000, because the check interval is set so high that each thread runs to completion before the next one gets a turn. The same effect can be constructed with the switch interval on Python 3.

import sys
import threading 

# Python 2.7 only: getcheckinterval/setcheckinterval were deprecated in 3.2
# and removed in 3.9.
print(sys.getcheckinterval())
sys.setcheckinterval(1000000)

# global variable x 
x = 0

def increment(): 
    """ 
    function to increment global variable x 
    """
    global x 
    x += 1

def thread_task(): 
    """ 
    task for thread 
    calls increment function 100000 times. 
    """
    for _ in range(100000): 
        increment() 

def main_task(): 
    global x 
    # setting global variable x as 0 
    x = 0

    # creating threads 
    t1 = threading.Thread(target=thread_task) 
    t2 = threading.Thread(target=thread_task) 

    # start threads 
    t1.start() 
    t2.start() 

    # wait until threads finish their job 
    t1.join() 
    t2.join() 

if __name__ == "__main__": 
    for i in range(10): 
        main_task() 
        print("Iteration {0}: x = {1}".format(i,x)) 
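
For Python 3, a comparable experiment might look like the sketch below. It shrinks the switch interval to make interleaving more likely, then runs the same workload with and without a threading.Lock. The helper names (unsafe_increment, safe_increment, run) and the 1e-6 interval are assumptions made for this illustration. Whether lost updates actually appear in the unlocked run depends on the interpreter version and the machine, so no particular output is implied for it; the locked run is the one that is guaranteed to reach 200000.

```python
import sys
import threading

N = 100000  # increments per thread, as in the question

x = 0
lock = threading.Lock()

def unsafe_increment():
    """Read-modify-write on x with no synchronization."""
    global x
    x += 1  # a thread switch between the load and the store loses an update

def safe_increment():
    """Same update, but only one thread at a time may perform it."""
    global x
    with lock:
        x += 1

def run(increment, n=N):
    """Reset x, let two threads each call increment() n times, return x."""
    global x
    x = 0

    def task():
        for _ in range(n):
            increment()

    threads = [threading.Thread(target=task) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x

if __name__ == "__main__":
    old = sys.getswitchinterval()
    sys.setswitchinterval(1e-6)  # request very frequent switch checks
    try:
        print("without lock:", run(unsafe_increment))
        print("with lock:   ", run(safe_increment))
    finally:
        sys.setswitchinterval(old)  # restore the previous interval
```

This is also why Lock still exists after 3.2: the GIL protects the interpreter's own internal state, not the consistency of a multi-step update in your code.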
