What difficulties might arise from using mutable arguments to an `lru_cache` decorated function?


Question

In a comment on: Is there a decorator to simply cache function return values?

@gerrit points out a problem with passing mutable, but hashable, objects to a function decorated with functools.lru_cache:



If I pass a hashable, mutable argument, and change the value of the object after the first call of the function, the second call will return the changed, not the original, object. That is almost certainly not what the user wants.
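The quoted behaviour can be reproduced with a small sketch (the `Box` class and `identity` function below are illustrative, not from the question): with Python's default identity-based `__hash__`, the second call is a cache hit, and the cached return value is the very object that has meanwhile been mutated.

```python
import functools


class Box:
    """A mutable container relying on the default identity-based __hash__."""
    def __init__(self, value):
        self.value = value


@functools.lru_cache(maxsize=None)
def identity(box):
    return box


b = Box(1)
first = identity(b)   # computed and cached
b.value = 2           # mutate the argument after the first call
second = identity(b)  # same identity-based hash -> cache hit
# second is first, and second.value == 2: the changed object, not the original
```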

From my understanding, assuming the __hash__() function of the mutable objects is manually defined to hash the member variables (rather than using the object's id(), which is the default for custom objects), changing the argument object will change the hash; hence, a second call to the lru_cache-decorated function should not use the cache.
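That reasoning can be sketched as follows (`Point` and `norm_sq` are hypothetical names): once the object mutates, its hash and equality change, so the lookup misses and the function is recomputed.

```python
import functools


class Point:
    """Mutable, but hashed by value: __hash__ and __eq__ use the coordinates."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __hash__(self):
        return hash((self.x, self.y))

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


call_count = 0


@functools.lru_cache(maxsize=None)
def norm_sq(p):
    global call_count
    call_count += 1
    return p.x ** 2 + p.y ** 2


p = Point(3, 4)
a = norm_sq(p)   # computed: 25
p.x = 0          # the mutation changes the hash and equality
b = norm_sq(p)   # cache miss -> recomputed: 16
```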

If the __hash__() function is defined correctly for the mutable argument(s), is there any unintended behavior that can arise from using mutable arguments to lru_cache decorated functions?

Answer

My comment was wrong/misleading and does not relate to lru_cache, but to any attempt to create a caching function that works more generically.

I was facing a need for a caching function that works for functions that take and return NumPy arrays, which are mutable and not hashable. Because NumPy arrays are not hashable, I could not use functools.lru_cache. I ended up writing something like this:

import copy
import functools


def mutable_cache(maxsize=10):
    """In-memory cache like functools.lru_cache but for any object

    This is a re-implementation of functools.lru_cache.  Unlike
    functools.lru_cache, it works for any objects, mutable or not.
    Therefore, it returns a copy and it is wrong if the mutable
    object has changed!  Use with caution!

    If you call the *resulting* function with a keyword argument
    'CLEAR_CACHE', the cache will be cleared.  Otherwise, cache is rotated
    when more than `maxsize` elements exist in the cache.  Additionally,
    if you call the resulting function with NO_CACHE=True, it doesn't
    cache at all.  Be careful with functions returning large objects.
    Everything is kept in RAM!

    Args:
        maxsize (int): Maximum number of return values to be remembered.

    Returns:
        New function that has caching implemented.
    """

    sentinel = object()  # unique marker to distinguish "missing" from None

    def decorating_function(user_function):
        cache = {}
        cache_get = cache.get
        keylist = []  # don't make it too long

        def wrapper(*args, **kwds):
            # pop the control keywords so they never reach user_function
            # and never leak into the cache key
            if kwds.pop("CLEAR_CACHE", False):
                cache.clear()
                keylist.clear()
            if kwds.pop("NO_CACHE", False):
                return user_function(*args, **kwds)
            key = str(args) + str(kwds)
            result = cache_get(key, sentinel)
            if result is not sentinel:
                # make sure we return a copy of the result; when a = f();
                # b = f(), users should reasonably expect that a is not b.
                return copy.copy(result)
            result = user_function(*args, **kwds)
            cache[key] = result
            keylist.append(key)
            if len(keylist) > maxsize:
                # evict the oldest entry; pop the key first so the two
                # structures cannot get out of sync
                oldest = keylist.pop(0)
                cache.pop(oldest, None)
            # copy on the first return too, so callers cannot mutate the
            # value held in the cache
            return copy.copy(result)

        return functools.update_wrapper(wrapper, user_function)

    return decorating_function

In my first version, I had omitted the copy.copy() call (which should really be copy.deepcopy()), which led to bugs if I changed the returned value and then called the cached function again. After I added the copy.copy() behaviour, I realised that I was hogging memory in some cases, primarily because my cache counts objects, not total memory usage, which is non-trivial to measure in general in Python (although it should be easy if limited to NumPy arrays). Therefore I added the NO_CACHE and CLEAR_CACHE keywords to the resulting function, which do what their names suggest.
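The difference that parenthetical alludes to can be shown in a few lines: copy.copy() only duplicates the outer container, so mutations through nested objects still reach the cached value, which is exactly what copy.deepcopy() prevents.

```python
import copy

cached = {"data": [1, 2, 3]}   # imagine this value living inside the cache

shallow = copy.copy(cached)    # new dict, but it shares the inner list...
shallow["data"].append(4)      # ...so this also mutates the "cached" list

deep = copy.deepcopy(cached)   # fully independent snapshot
deep["data"].append(5)         # the "cached" value is unaffected
```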

After writing and using this function, I understand there is more than one good reason for functools.lru_cache to work only for functions with hashable input arguments. Anyone needing a caching function that works with mutable arguments needs to be very careful.
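For contrast, functools.lru_cache enforces the hashability requirement up front, which is why unhashable inputs such as NumPy arrays need a custom scheme like the one above at all. A minimal sketch, using a plain list as a stand-in for an ndarray:

```python
import functools


@functools.lru_cache(maxsize=None)
def total(values):
    return sum(values)


result = total((1, 2, 3))    # a hashable tuple is fine
try:
    total([1, 2, 3])         # lists, like NumPy arrays, are unhashable
    hashable_only = False
except TypeError:
    hashable_only = True     # lru_cache raises before calling the function
```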

