Python 3 Multiprocessing Pool is slow with large variables

Problem description

I'm running into a very peculiar issue with using multiprocessing pools in Python 3... See the code below:

import multiprocessing as MP

class c(object):
    def __init__(self):
        self.foo = ""

    def a(self, b):
        # Trivial task: just return the argument.
        return b

    def main(self):
        # Load a file with ~2 million lines into the instance.
        with open("/path/to/2million/lines/file", "r") as f:
            self.foo = f.readlines()

o = c()
o.main()
p = MP.Pool(5)
# o.a is a bound method of the instance o.
for r in p.imap(o.a, range(1, 10)):
    print(r)

If I execute this code as is, this is my extremely slow result:

1
2
3
4
5
6
7
8
9

real    0m6.641s
user    0m7.256s
sys     0m1.824s                    

However, if I remove the line o.main(), I get a much faster execution time:

1
2
3
4
5
6
7
8
9

real    0m0.155s
user    0m0.048s
sys     0m0.004s

My environment has plenty of power, and I've made sure I'm not running into any memory limits. I also tested it with a smaller file, and execution time is much more acceptable. Any insight?

I removed the disk I/O part and just created the list in memory instead. The timings below show that disk I/O has nothing to do with the problem:

# in main(), replacing self.foo = f.readlines()
for i in range(1, 500000):
    self.foo.append("foobar%d\n" % i)

real    0m1.763s
user    0m1.944s
sys     0m0.452s

for i in range(1, 1000000):
    self.foo.append("foobar%d\n" % i)

real    0m3.808s
user    0m4.064s
sys     0m1.016s

Recommended answer

Under the hood, multiprocessing.Pool uses a Pipe to transfer the data from the parent process to the Pool workers.

This adds a hidden cost to the scheduling of tasks, as the entire o object gets serialised via pickle and transferred over an OS pipe.

This is done for each and every task you are scheduling (9 times in your example, once per element of range(1, 10)). If your file is 10 MB in size, you are shifting roughly 90 MB of data.
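You can see the size of that per-task payload directly. The following is a minimal sketch, not part of the original answer; it assumes the class c from the question is defined. Pickling the bound method o.a serialises the whole instance, including the large list in self.foo:

import pickle

o = c()
o.main()  # fills o.foo with the large list

# Pickling the bound method drags the entire instance along,
# so the payload grows with the size of o.foo.
payload = pickle.dumps(o.a)
print("pickled task size: %d bytes" % len(payload))

This is exactly what multiprocessing does once per scheduled task.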

According to the multiprocessing programming guidelines:

As far as possible one should try to avoid shifting large amounts of data between processes.

A simple way to speed up your logic would be to count the lines in the file, split them into equal chunks, and send only the line indexes to the worker processes, letting each worker open the file, seek to its lines, and process the data, as sketched below.
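Here is a minimal sketch of that approach, under a few assumptions: FILENAME reuses the placeholder path from the question, the per-line handling is a stand-in for your real logic, and the worker is a module-level function so that only a small (start, stop) tuple is pickled per task. Since lines have variable length, the sketch skips to the right lines by enumerating them rather than by a byte-offset seek:

import multiprocessing as MP

FILENAME = "/path/to/2million/lines/file"  # placeholder path from the question

def count_lines(path):
    with open(path) as f:
        return sum(1 for _ in f)

def work(bounds):
    # Each worker opens the file itself; only the small
    # (start, stop) tuple crosses the pipe, not the file contents.
    start, stop = bounds
    handled = 0
    with open(FILENAME) as f:
        for index, line in enumerate(f):
            if index >= stop:
                break
            if index >= start:
                handled += 1  # placeholder: process the line here
    return handled

if __name__ == "__main__":
    total = count_lines(FILENAME)
    workers = 5
    chunk = -(-total // workers)  # ceiling division
    bounds = [(i, min(i + chunk, total)) for i in range(0, total, chunk)]
    with MP.Pool(workers) as pool:
        for r in pool.imap(work, bounds):
            print(r)

Each task now costs two integers on the pipe instead of the whole file, so the scheduling overhead stays constant no matter how large the data loaded into self.foo would have been.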
