import在Python线程中 [英] import inside of a Python thread

查看:210
本文介绍了import在Python线程中的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一些函数交互式加载python模块使用 __ import __



我最近偶然发现了一些关于import lock在Python中,也就是说,一个专门用于导入的锁(不仅仅是GIL)。



这让我想知道在线程中导入的实践。


  1. import / __ import __ 线程安全吗?
  2. $


  3. b $ b

    编辑2012年9月12日



    感谢您的回复Soravux。
    所以import是线程安全的,我不担心死锁,因为在我的代码中使用 __ import __ 的函数不会相互调用。 p>

    你知道即使模块已经导入,锁是否被获取?
    如果是这样,我应该在sys.modules中查看模块是否已经在调用 __ import __ 之前已经被导入。



    当然这不应该在CPython中有很大的区别,因为有GIL。



    编辑2012年9月19日














    $ b

    关于Jython,这里是他们在文档中说的:



    http://www.jython.org/jythonbook/en/1.0/Concurrency.html#module-import-lock


    然而,Python却定义了一个模块导入锁,它是由Jython实现的
    。每当导入
    名称时,都会获取此锁。这是真的无论导入是通过import
    语句,等效的 __ import __ 内置或相关代码。需要注意的是
    很重要,即使相应的模块已经导入了
    ,模块导入锁仍然被获取,如果只有


    < blockquote>

    所以,似乎在进行导入之前检入sys.modules是有意义的,以避免获取锁。你认为什么?

    解决方案

    正常导入是线程安全的,因为它们在执行前获取导入锁定,已经完成了。如果使用可用的挂钩添加自己的自定义导入,请务必向其中添加此锁定方案。 Python中的锁定功能可以通过 imp 模块( imp.lock_held() / acquire_lock() / release_lock())。



    t除了已知



    创建线程的低级调用 clone 然后是Python的线程类似操作。分叉和克隆在不同的内存段上应用不同的行为。例如,只有堆栈不被线程共享,相比克隆更多段(数据(通常是COW),堆栈,代码,堆)的叉子,有效地不共享其内容。 Python中的导入机制使用不是放置在堆栈上的全局命名空间,因此使用与其线程的共享段。由于导入的副作用(即存储器中的变化)在同一段中起作用,所以它表现为单线程程序。但是在多线程程序的导入中要小心使用线程安全库。



    顺便说一下,Python中的线程程序会遇到这样的问题: 会导致混乱在这种环境中使用对线程不安全的函数的调用。 GIL ,除非您的程序是I / O绑定或依赖C或外部线程安全库(因为它们在执行之前释放GIL)。在两个线程中运行相同的导入函数将不会同时执行,因为这个GIL。注意,这只是CPython的限制,Python的其他实现将有不同的行为。



    要回答你的编辑:导入的模块都是由Python缓存的。如果模块已经加载到缓存中,它将不会再次运行,导入语句(或函数)将立即返回。你不必在sys.modules中实现自己的缓存查找,Python为你做这件事,除了sys的GIL之外,不会 imp 锁定任何东西。模块查找。



    回答你的第二个编辑:我喜欢维护一个更简单的代码,而不是优化调用我使用的库(在这种情况下,标准库)。原理是,执行某事所需的时间通常比导入所执行的模块所需的时间更重要。此外,在整个项目中维护这种代码所需的时间比执行所需的时间要高。这一切归结为:程序员时间比CPU时间更有价值。


    I have some functions that interactively load python modules using __import__

    I recently stumbled upon some article about an "import lock" in Python, that is, a lock specifically for imports (not just the GIL). But the article was old so maybe that's not true anymore.

    This makes me wonder about the practice of importing in a thread.

    1. Are import/__import__ thread safe?
    2. Can they create dead locks?
    3. Can they cause performance issues in a threaded application?

    EDIT 12 Sept 2012

    Thanks for the great reply Soravux. So import are thread safe, and I'm not worrying about deadlocks, since the functions that use __import__ in my code don't call each others.

    Do you know if the lock is acquired even if the module has already been imported ? If that is the case, I should probably look in sys.modules to check if the module has already been imported before making a call to __import__.

    Sure this shouldn't make a lot of difference in CPython since there is the GIL anyway. However it could make a lot of difference on other implementations like Jython or stackless python.

    EDIT 19 Sept 2012

    About Jython, here's what they say in the doc:

    http://www.jython.org/jythonbook/en/1.0/Concurrency.html#module-import-lock

    Python does, however, define a module import lock, which is implemented by Jython. This lock is acquired whenever an import of any name is made. This is true whether the import goes through the import statement, the equivalent __import__ builtin, or related code. It’s important to note that even if the corresponding module has already been imported, the module import lock will still be acquired, if only briefly.

    So, it seems that it would make sense to check in sys.modules before making an import, to avoid acquiring the lock. What do you think?

    解决方案

    Normal imports are thread safe because they acquire an import lock prior to execution and release it once the import is done. If you add your own custom imports using the hooks available, be sure to add this locking scheme to it. Locking facilities in Python may be accessed by the imp module (imp.lock_held()/acquire_lock()/release_lock()).

    Using this import lock won't create any deadlocks or dependency errors aside from the circular dependencies that are already known.

    The low-level call to create a thread being clone on Linux, threading in Python is then a fork-like operation. Forking and cloning applies different behaviors on the various memory segments. For exemple, only the stack is not shared by threads, compared to forks which clones more segments (Data (often COW), Stack, Code, Heap), effectively not sharing its content. The import mechanism in Python uses the global namespace which is not placed on the stack, thus using a shared segment with its threads. Since the side-effects (ie. the changes in memory) of importing works in the same segments, it behaves as a single-threaded program. Be careful to use thread safe libraries in your imports on multithreaded programs, though. It will cause mayhem to use calls to functions that are not thread safe in such environment.

    By the way, threaded programs in Python suffers the GIL which won't allow much performance gains unless your program is I/O bound or rely on C or external thread-safe libraries (since they release the GIL before executing). Running in two threads the same imported function won't execute concurrently because of this GIL. Note that this is only a limitation of CPython and other implementations of Python will have a different behavior.

    To answer your edit: imported modules are all cached by Python. If the module is already loaded in the cache, it won't be run again and the import statement (or function) will return right away. You don't have to implement yourself the cache lookup in sys.modules, Python does that for you and won't imp lock anything, aside from the GIL for the sys.modules lookup.

    To answer your second edit: I prefer having to maintain a simpler code than trying to optimize calls to the libraries I use (in this case, the standard library). The rationale is that the time required to perform something is usually way more important than the time required to import the module that does it. Furthermore, the time required to maintain this kind of code throughout the project is way higher than the time it will take to execute. It all boils down to: "programmer time is more valuable than CPU time".

    这篇关于import在Python线程中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆