Python Global Interpreter Lock (GIL) workaround on multi-core systems using taskset on Linux?

Question

So I just finished watching this talk on the Python Global Interpreter Lock (GIL) http://blip.tv/file/2232410.

The gist of it is that the GIL is a pretty good design for single-core systems (Python essentially leaves thread handling/scheduling up to the operating system), but that it can seriously backfire on multi-core systems: you end up with I/O-intensive threads being heavily blocked by CPU-intensive threads, the expense of context switching, the ctrl-C problem [*], and so on.

So, since the GIL basically limits a Python program to executing on one CPU, my thought is: why not accept this and simply use taskset on Linux to set the program's affinity to a certain core/CPU on the system (especially in a situation where multiple Python apps are running on a multi-core system)?
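For reference, pinning could be done either from the shell (e.g. taskset -c 0 python yourscript.py, where the script name is just a placeholder) or from inside the process itself. A minimal sketch of the in-process variant, assuming Linux and Python 3.3+ (os.sched_setaffinity is Linux-specific):

```python
# Minimal sketch: pin the current process to a single core from inside Python.
# os.sched_setaffinity / os.sched_getaffinity are Linux-only (Python 3.3+);
# the shell equivalent would be: taskset -c 0 python yourscript.py
import os

print("allowed CPUs before:", os.sched_getaffinity(0))  # 0 = the current process
os.sched_setaffinity(0, {0})                             # restrict to core 0 only
print("allowed CPUs after: ", os.sched_getaffinity(0))
```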

So ultimately my question is this: has anyone tried using taskset on Linux with Python applications (especially when running multiple applications on a Linux system, so that multiple cores can be used, with one or two Python applications bound to a specific core), and if so, what were the results? Is it worth doing? Does it make things worse for certain workloads? I plan to do this and test it out (basically see if the program takes more or less time to run), but would love to hear from others about your experiences.

Addition: David Beazley (the guy giving the talk in the linked video) pointed out that some C/C++ extensions manually release the GIL, and if such an extension is optimized for multi-core (e.g. scientific or numeric data analysis), then rather than getting the benefits of multi-core number crunching it would effectively be crippled by being limited to a single core (thus potentially slowing your program down significantly). On the other hand, if you aren't using extensions such as this

The reason I am not using the multiprocessing module is that (in this case) part of the program is heavily network I/O bound (HTTP requests), so having a pool of worker threads is a GREAT way to squeeze performance out of a box: a thread fires off an HTTP request and, since it's waiting on I/O, gives up the GIL so another thread can do its thing. That part of the program can easily run 100+ threads without hurting the CPU much, and lets me actually use the network bandwidth that is available. As for Stackless Python etc., I'm not overly interested in rewriting the program or replacing my Python stack (availability would also be a concern).
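As a rough illustration of that thread-pool pattern (this is a sketch, not code from the original program; the URL list is a placeholder), the standard library alone is enough for the I/O-bound part:

```python
# Minimal sketch of an I/O-bound worker pool: each thread blocks on the network,
# releasing the GIL while it waits, so many requests can be in flight at once.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

urls = ["http://example.com/"] * 100  # placeholder workload

def fetch(url):
    with urlopen(url, timeout=10) as resp:  # GIL is released during the blocking read
        return len(resp.read())

with ThreadPoolExecutor(max_workers=100) as pool:
    sizes = list(pool.map(fetch, urls))

print(f"fetched {len(sizes)} pages, {sum(sizes)} bytes total")
```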

[*] Only the main thread can receive signals, so if you send a ctrl-C the Python interpreter basically tries to get the main thread to run so it can handle the signal. But since the interpreter doesn't directly control which thread runs (that is left to the operating system), it basically tells the OS to keep switching threads until it eventually hits the main thread (which, if you are unlucky, may take a while).
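A small, purely illustrative way to observe this (the names and numbers are made up, not from the talk): start a few CPU-bound threads, press ctrl-C, and note that the KeyboardInterrupt only surfaces once the main thread gets scheduled again (on recent CPython versions the delay may be short):

```python
# Minimal sketch of the ctrl-C problem: SIGINT is only ever handled in the
# main thread, so CPU-bound worker threads can delay its delivery.
import threading
import time

def spin():
    while True:        # CPU-bound loop competing with the main thread for the GIL
        pass

for _ in range(4):
    threading.Thread(target=spin, daemon=True).start()

try:
    while True:
        time.sleep(1)  # the main thread mostly sleeps; ctrl-C must reach it here
except KeyboardInterrupt:
    print("KeyboardInterrupt finally delivered to the main thread")
```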

Answer

I have never heard of anyone using taskset for a performance gain with Python. Doesn't mean it can't happen in your case, but definitely publish your results so others can critique your benchmarking methods and provide validation.
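For what it's worth, the benchmark could be as simple as timing one CPU-bound threaded workload twice, e.g. python bench.py versus taskset -c 0 python bench.py, and comparing the wall-clock numbers (an illustrative sketch; the file name and loop sizes are made up):

```python
# bench.py: minimal sketch that runs the same threaded CPU-bound workload and
# reports wall-clock time; compare a plain run against a taskset-pinned run.
import threading
import time

def cpu_task(n=5_000_000):
    total = 0
    for i in range(n):
        total += i
    return total

start = time.perf_counter()
threads = [threading.Thread(target=cpu_task) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"elapsed: {time.perf_counter() - start:.2f}s")
```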

Personally though, I would decouple your I/O threads from the CPU-bound threads using a message queue. That way your front end is now completely network I/O bound (some of it with an HTTP interface, some with a message queue interface) and ideal for your threading situation. Then the CPU-intensive processes can either use multiprocessing or just be individual processes waiting for work to arrive on the message queue.
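A minimal sketch of that split, assuming a local multiprocessing.Queue stands in for the message queue (in practice it could be an external broker) and with a placeholder in place of the real CPU-intensive work:

```python
# Minimal sketch: an I/O front end (threads) hands work to CPU-bound back-end
# processes via a queue; each process has its own GIL, so multiple cores get used.
import multiprocessing as mp
import threading

def cpu_worker(q):
    while True:
        item = q.get()
        if item is None:                         # sentinel: shut down this worker
            break
        print(sum(i * i for i in range(item)))   # placeholder for real CPU work

def io_frontend(q):
    # placeholder for threads that do HTTP requests and enqueue follow-up work
    for n in (100_000, 200_000, 300_000):
        q.put(n)

if __name__ == "__main__":
    work_q = mp.Queue()
    workers = [mp.Process(target=cpu_worker, args=(work_q,)) for _ in range(2)]
    for w in workers:
        w.start()

    t = threading.Thread(target=io_frontend, args=(work_q,))
    t.start()
    t.join()

    for _ in workers:
        work_q.put(None)   # one sentinel per worker
    for w in workers:
        w.join()
```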

In the longer term you might also want to consider replacing your threaded I/O front end with Twisted or something like eventlet, because even if they won't help performance, they should improve scalability. Your back end is already scalable because you can run your message queue over any number of machines/CPUs as needed.
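For the front-end suggestion, a minimal eventlet-flavoured sketch (assuming the eventlet package is installed; the URL list is a placeholder) might look roughly like this; monkey_patch() makes the standard socket module cooperative, so hundreds of green threads can wait on the network concurrently:

```python
# Minimal sketch with eventlet green threads: monkey_patch() makes blocking
# socket calls cooperative, and GreenPool caps how many fetches run at once.
import eventlet
eventlet.monkey_patch()  # must run before other network imports

from urllib.request import urlopen

urls = ["http://example.com/"] * 50  # placeholder workload

def fetch(url):
    return len(urlopen(url, timeout=10).read())

pool = eventlet.GreenPool(200)
for size in pool.imap(fetch, urls):
    print(size)
```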
