Python code performance decreases with threading


Problem Description

I've written a working program in Python that basically parses a batch of binary files, extracting data into a data structure. Each file takes around a second to parse, which translates to hours for thousands of files. I've successfully implemented a threaded version of the batch parsing method with an adjustable number of threads. I tested the method on 100 files with a varying number of threads, timing each run. Here are the results (0 threads refers to my original, pre-threading code; 1 thread refers to the new version run with a single spawned thread).

0 threads: 83.842 seconds
1 threads: 78.777 seconds
2 threads: 105.032 seconds
3 threads: 109.965 seconds
4 threads: 108.956 seconds
5 threads: 109.646 seconds
6 threads: 109.520 seconds
7 threads: 110.457 seconds
8 threads: 111.658 seconds

Though spawning a thread confers a small performance increase over having the main thread do all the work, increasing the number of threads actually decreases performance. I would have expected to see performance increases, at least up to four threads (one for each of my machine's cores). I know threads have associated overhead, but I didn't think this would matter so much with single-digit numbers of threads.

I've heard of the "global interpreter lock", but as I move up to four threads I do see the corresponding number of cores at work--with two threads two cores show activity during parsing, and so on.

I also tested some different versions of the parsing code to see if my program is IO bound. It doesn't seem to be; just reading in the file takes a relatively small proportion of the time, and processing the file is almost all of it. If I skip the IO and process an already-read version of a file, adding a second thread damages performance and a third improves it only slightly. I'm just wondering why I can't take advantage of my computer's multiple cores to speed things up. Please post any questions or ways I could clarify.
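For reference, a minimal sketch of the kind of threaded batch parser described above; parse_file, paths, and num_threads are hypothetical stand-ins for the actual code:

import threading
from queue import Queue, Empty

def parse_batch_threaded(paths, num_threads, parse_file):
    # Distribute file paths across worker threads via a thread-safe queue.
    work = Queue()
    for path in paths:
        work.put(path)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                path = work.get_nowait()
            except Empty:
                return  # no files left; this worker is done
            data = parse_file(path)  # CPU-bound: holds the GIL most of the time
            with lock:
                results.append(data)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results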

Solution

This is sadly how things are in CPython, mainly due to the Global Interpreter Lock (GIL). Python code that's CPU-bound simply doesn't scale across threads (I/O-bound code, on the other hand, might scale to some extent).
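A self-contained way to observe this (a timing sketch, not the parser above): split a fixed amount of pure-Python work across threads and note that the wall time does not drop.

import threading
import time

def cpu_bound(n):
    # Pure-Python arithmetic: the thread holds the GIL while it runs.
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(num_threads, total_work=8_000_000):
    # Split the same total workload evenly across num_threads threads.
    per_thread = total_work // num_threads
    threads = [threading.Thread(target=cpu_bound, args=(per_thread,))
               for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

# On CPython, the multi-threaded runs take about as long as (or longer
# than) the single-threaded one, because only one thread executes
# Python bytecode at a time.
for n in (1, 2, 4):
    print(f"{n} thread(s): {timed(n):.2f}s")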

There is a highly informative presentation by David Beazley in which he discusses some of the issues surrounding the GIL. The video can be found here (thanks @Ikke!).

My recommendation would be to use the multiprocessing module instead of multiple threads.
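A minimal sketch of what that could look like for the batch-parsing case, assuming a module-level parse_file function standing in for the actual parser (worker processes need to be able to pickle and import it):

import multiprocessing

def parse_file(path):
    # Hypothetical stand-in for the real binary parser; must live at
    # module level so worker processes can import it.
    with open(path, "rb") as f:
        raw = f.read()
    return len(raw)  # placeholder for the extracted data structure

if __name__ == "__main__":
    paths = ["a.bin", "b.bin"]  # hypothetical batch of files
    # Each worker is a separate process with its own interpreter and
    # its own GIL, so CPU-bound parsing can use all of the cores.
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(parse_file, paths)
    print(results)

pool.map preserves input order, and returning compact data structures from parse_file keeps the inter-process pickling overhead small relative to the roughly one second of parsing per file.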
