Memory usage keeps growing with Python's multiprocessing.pool

Problem description

Here is the program:

#!/usr/bin/python

import multiprocessing

def dummy_func(r):
    pass

def worker():
    pass

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=16)
    for index in range(0,100000):
        pool.apply_async(worker, callback=dummy_func)

    # clean up
    pool.close()
    pool.join()

I found that memory usage (both VIRT and RES) kept growing until close()/join(). Is there any way to get rid of this? I tried maxtasksperchild with 2.7, but it didn't help either.
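For context, maxtasksperchild is an argument to the Pool constructor; a minimal sketch of how it plugs into the program above (1000 is just an illustrative value). It recycles worker processes after a fixed number of tasks, which bounds growth inside the workers but does not address the backlog of pending tasks that piles up in the parent when apply_async() is called faster than tasks complete:

import multiprocessing

# Recycle each worker process after 1000 tasks (illustrative value); this limits
# growth inside the workers, but the parent still queues every pending task.
pool = multiprocessing.Pool(processes=16, maxtasksperchild=1000)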

I have a more complicated program that calls apply_async() ~6M times, and at the ~1.5M point I already had 6G+ RES. To rule out all other factors, I simplified the program to the version above.

It turned out this version works better, thanks for everyone's input:

#!/usr/bin/python

import multiprocessing

ready_list = []
def dummy_func(index):
    # Callback invoked by the pool when a task finishes; record its index.
    global ready_list
    ready_list.append(index)

def worker(index):
    return index

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=16)
    result = {}
    for index in range(0, 1000000):
        result[index] = pool.apply_async(worker, (index,), callback=dummy_func)
        # Drop the AsyncResult objects of tasks that have already completed,
        # so the result dict does not grow without bound.
        for ready in ready_list:
            result[ready].wait()
            del result[ready]
        ready_list = []

    # clean up
    pool.close()
    pool.join()
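A small variation on the bookkeeping loop above (my own suggestion, not part of the original post): calling get() instead of wait() on each finished AsyncResult has the same memory behaviour, but also re-raises any exception the worker hit, so failures don't pass silently. A minimal sketch, assuming the same worker and callback as above:

#!/usr/bin/python

import multiprocessing

ready_list = []
def dummy_func(index):
    ready_list.append(index)

def worker(index):
    return index

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=16)
    result = {}
    for index in range(0, 1000000):
        result[index] = pool.apply_async(worker, (index,), callback=dummy_func)
        for ready in ready_list:
            result[ready].get()  # like wait(), but re-raises exceptions from worker()
            del result[ready]
        ready_list = []
    pool.close()
    pool.join()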

I didn't put any lock there, as I believe the main process is single-threaded (the callback is more or less an event-driven thing, per the docs I read).
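One way to check that assumption (a throwaway sketch I added, not from the original post): the pool dispatches callbacks from an internal result-handler thread in the parent process, so printing the current thread name inside the callback shows where it actually runs; a plain list.append() from the callback is still atomic under the GIL.

import multiprocessing
import threading

def worker():
    return 42

def dummy_func(value):
    # Show which thread in the parent process invoked the callback.
    print('callback ran on: %s' % threading.current_thread().name)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=2)
    pool.apply_async(worker, callback=dummy_func)
    pool.close()
    pool.join()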

I changed v1's index range to 1,000,000, the same as v2, and ran some tests. Oddly, v2 is even ~10% faster than v1 (33s vs 37s); maybe v1 was doing too much internal list maintenance. v2 is definitely the winner on memory usage: it never went over 300M (VIRT) and 50M (RES), while v1 used to hit 370M/120M, with 330M/85M at best. All numbers are from only 3~4 test runs, for reference only.

Recommended answer

I had memory issues recently, since I was calling the multiprocessing function multiple times, so it kept spawning processes and leaving them in memory.

Here's the solution I'm using now:

def myParallelProcess(ahugearray):
    from multiprocessing import Pool
    from contextlib import closing
    # closing() ensures pool.close() is called when the with block exits,
    # so the workers are released instead of lingering in memory.
    with closing(Pool(15)) as p:
        res = p.imap_unordered(simple_matching, ahugearray, 100)
    return res

I ❤
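As a side note (my addition, not part of the answer): on Python 3.3+ the Pool object is itself a context manager, but its __exit__ calls terminate() rather than close(), so results not yet consumed when the with block exits can be lost. Consuming the iterator inside the block avoids that; a sketch along these lines, where simple_matching is just a placeholder for the real function:

from multiprocessing import Pool

def simple_matching(item):
    # Placeholder for the real matching function used in the answer.
    return item

def myParallelProcess(ahugearray):
    with Pool(15) as p:
        # Consume the iterator before the pool is terminated on block exit.
        return list(p.imap_unordered(simple_matching, ahugearray, 100))

if __name__ == '__main__':
    print(myParallelProcess(range(1000))[:5])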
