触发Dask员工释放记忆 [英] Trigger Dask workers to release memory

查看:67
本文介绍了触发Dask员工释放记忆的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用Dask分发某些函数的计算.我的总体布局如下所示:

I'm distributing the computation of some functions using Dask. My general layout looks like this:


    from dask.distributed import Client, LocalCluster, as_completed

    cluster = LocalCluster(processes=config.use_dask_local_processes,
                           n_workers=1,
                           threads_per_worker=1,
                           )
    client = Client(cluster)
    cluster.scale(config.dask_local_worker_instances)

    work_futures = []

    # For each group do work
    for group in groups:
        fcast_futures.append(client.submit(_work, group))

    # Wait till the work is done
    for done_work in as_completed(fcast_futures, with_results=False):
        try:
            result = done_work.result()
        except Exception as error:
            log.exception(error)

我的问题是,对于大量作业,我倾向于达到内存限制.我看到很多:

My issue is that for a large number of jobs I tend to hit memory limits. I see a lot of:

distributed.worker - WARNING - Memory use is high but worker has no data to store to disk.  Perhaps some other process is leaking memory?  Process memory: 1.15 GB -- Worker memory limit: 1.43 GB

似乎每个未来都不会释放自己的记忆.我该如何触发呢?我在Python 2.7上使用dask == 1.2.0.

It seems that each future isn't releasing its memory. How can I trigger that? I'm using dask==1.2.0 on Python 2.7.

推荐答案

只要客户机上有指望它的结果,调度程序就会为您提供帮助.当最后的将来(或不久之后)由python进行垃圾收集时,将释放内存.在您的情况下,您要在整个计算过程中将所有期货保存在列表中.您可以尝试修改循环:

Results are help by the scheduler so long as there is a future on a client pointing to it. Memory is released when (or shortly after) the last future is garbage-collected by python. In your case you are keeping all of your futures in a list throughout the computation. You could try modifying your loop:

for done_work in as_completed(fcast_futures, with_results=False):
    try:
        result = done_work.result()
    except Exception as error:
        log.exception(error)    
    done_work.release()

或以某种处理替换 as_completed 循环,以在处理完期货后将其从列表中明确删除.

or replacing the as_completed loop with something that explicitly removes futures from the list once they have been processed.

这篇关于触发Dask员工释放记忆的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆