Multiple processes sharing a single Joblib cache


Question

I'm using Joblib to cache the results of a computationally expensive function in my Python script. The function's input arguments and return values are numpy arrays. The cache works fine for a single run of the script. Now I want to spawn multiple runs of the script in parallel to sweep some parameter in an experiment. (The definition of the function remains the same across all runs.)

Is there a way to share the joblib cache among multiple Python scripts running in parallel? This would save a lot of function evaluations that are repeated across different runs but never within a single run. I couldn't find anything in Joblib's documentation about whether this is possible.
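For concreteness, a minimal sketch of this setup (the function body, parameter values, and names here are hypothetical stand-ins):

import numpy as np
from joblib import Memory

mem = Memory("./cache")  # per-run cache directory

@mem.cache
def expensive(x, p):
    # stand-in for the real computationally expensive function on numpy arrays
    return np.linalg.matrix_power(np.outer(x, x) + p * np.eye(x.size), 10)

for p in (0.1, 0.2, 0.3):  # the parameter swept within one run
    result = expensive(np.arange(100.0), p)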

Answer

Specify a common, fixed cachedir and decorate the function that you want to cache using:

from joblib import Memory

cachedir = "./joblib_cache"  # any fixed path that every run can reach
mem = Memory(cachedir)       # keyword is `cachedir=` in older joblib, `location=` in joblib >= 0.12

@mem.cache
def f(arguments):
    """do things"""
    pass

Or simply:

def g(arguments):
    pass

cached_g = mem.cache(g)
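The two forms are equivalent: mem.cache returns a wrapper that checks cachedir for a stored result (keyed on a hash of the arguments) before running the function itself, so cached_g is called exactly like g. Joblib also tracks the function's source code and recomputes if it changes.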

Then, even if you are working across processes or across machines, as long as all instances of your program have access to cachedir, common function calls can be cached there transparently.
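For example, here is a minimal sketch of several runs sharing one cache (the function body, parameter values, and the ./joblib_cache path are hypothetical; any directory reachable by all runs works):

import numpy as np
from joblib import Memory
from multiprocessing import Process

cachedir = "./joblib_cache"        # shared, fixed location visible to every run
mem = Memory(cachedir, verbose=1)  # verbose=1 prints a message whenever the function is actually computed

@mem.cache
def expensive(x, p):
    # stand-in for the real computationally expensive function
    return np.sin(x) * p

def run(p):
    x = np.linspace(0.0, 1.0, 1000)
    expensive(x, p)  # computed on the first call with these arguments, loaded from disk afterwards

if __name__ == "__main__":
    # sequential here only so the cache hit is easy to see in the logs:
    # p=1.0 is evaluated by the first process and reused by the second
    for p in (1.0, 1.0, 2.0):
        proc = Process(target=run, args=(p,))
        proc.start()
        proc.join()

One caveat: joblib does not coordinate between processes, so two runs that make the same not-yet-cached call at the same moment will both compute it. The result is still correct, and later calls load it from disk.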
