How to memoize **kwargs?


Question

I haven't seen an established way to memoize a function that takes keyword arguments, i.e. something with the signature

def f(*args, **kwargs)

The difficulty is that a memoizer typically uses a dict to cache results for a given set of input parameters, and kwargs is itself a dict and hence unhashable. Following earlier discussions, I have tried using

(args, frozenset(kwargs.items()))

as the key to the cache dict, but this only works if the values in kwargs are hashable. Furthermore, as pointed out in the answers below, frozenset is not an ordered data structure. Therefore this solution might be safer:

(args, tuple(sorted(kwargs.items())))

But it still cannot cope with unhashable elements. Another approach I have seen is to use a string representation of the kwargs in the cache key:

(args, str(sorted(kwargs.items())))

The only drawback I see with this is the overhead of hashing a potentially very long string. As far as I can see the results should be correct. Can anyone spot any problems with the latter approach? One of the answers below points out that this assumes certain behaviour of the __str__ or __repr__ methods of the keyword argument values. This seems like a show-stopper.

Is there another, more established way of achieving memoization that can cope with **kwargs and unhashable argument values?

Recommended answer

key = (args, frozenset(kwargs.items()))

This is the "best" you can do without making assumptions about your data.
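To show how this key fits into a memoizer, here is a minimal sketch of a decorator built around it. The names memoize, add and calls are illustrative and not from the original answer; the sketch assumes every argument value is hashable.

```python
import functools

def memoize(func):
    """Cache results keyed on positional and keyword arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # frozenset ignores keyword order, so f(a=1, b=2) and
        # f(b=2, a=1) map to the same cache entry.
        key = (args, frozenset(kwargs.items()))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper

calls = []  # records each real (uncached) invocation

@memoize
def add(x, y=0, z=0):
    calls.append((x, y, z))
    return x + y + z

print(add(1, y=2, z=3))  # 6
print(add(1, z=3, y=2))  # 6, served from the cache
print(len(calls))        # 1
```

Passing an unhashable value such as add(1, y=[2]) raises TypeError while the key is being built, which is exactly the limitation raised in the question.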

However, it is conceivable (if a bit unusual) to want to memoize on arguments that are themselves dictionaries, and you could special-case that if you wanted to. For example, you could recursively apply frozenset(---.items()) while copying dictionaries.
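The recursive idea could look something like the sketch below. The helpers freeze and make_key are hypothetical names, and the handling of lists and sets goes beyond what the answer describes; it is one possible set of assumptions, not an established recipe.

```python
def freeze(value):
    """Recursively convert common containers into hashable equivalents."""
    if isinstance(value, dict):
        return frozenset((k, freeze(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    if isinstance(value, (set, frozenset)):
        return frozenset(freeze(v) for v in value)
    # Anything else is assumed to be hashable already; if it is not,
    # hashing the finished key raises TypeError.
    return value

def make_key(args, kwargs):
    return (freeze(args), freeze(kwargs))

k1 = make_key((1, [2, 3]), {"opts": {"a": 1, "b": [4]}})
k2 = make_key((1, [2, 3]), {"opts": {"b": [4], "a": 1}})
print(k1 == k2)              # True
print(hash(k1) == hash(k2))  # True
```

Note the trade-off: this conflates types, so a list [1, 2] and a tuple (1, 2) produce the same key, as do a dict and a set of equivalent (key, value) pairs. Whether that matters depends on the function being memoized.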

If you use sorted, you could end up in a bad situation where you have unorderable keys. For example, the Python documentation says: "The subset and equality comparisons do not generalize to a complete ordering function. For example, any two disjoint sets are not equal and are not subsets of each other, so all of the following return False: a<b, a==b, or a>b. Accordingly, sets do not implement the __cmp__() method."

>>> sorted([frozenset({1,2}), frozenset({1,3})])
[frozenset({1, 2}), frozenset({1, 3})]

>>> sorted([frozenset({1,3}), frozenset({1,2})]) # THE SAME
[frozenset({1, 3}), frozenset({1, 2})] # DIFFERENT SORT RESULT

# sorted(stuff) != sorted(reversed(stuff)), if not strictly totally ordered

Edit: Ignacio says "While you can't use sorted() on arbitrary dicts, kwargs will have str keys." This is entirely correct. So this is not an issue for keys, though it may be something to keep in mind for values if you (or, less likely, a repr) rely on sorting somehow.

Regarding the use of str:

Most data will work nicely, but it is possible for an adversary (e.g. in a security-vulnerability context) to craft a collision. It's not easy, mind you, because most default reprs use plenty of good grouping and escaping. In fact, I was not able to find such a collision with built-in types. But it is possible with sloppy third-party or incomplete repr implementations.
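As an illustration of the "sloppy repr" case, the sketch below uses a made-up class, Celsius, whose __repr__ drops the type name. The helper str_key is also a hypothetical name; it simply wraps the string-based key from the question.

```python
class Celsius:
    """Hypothetical third-party type with a careless __repr__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __repr__(self):
        # Sloppy: indistinguishable from the repr of a plain number.
        return str(self.degrees)

def str_key(args, kwargs):
    # The string-based key from the question.
    return (args, str(sorted(kwargs.items())))

k_int = str_key((), {"t": 20})
k_obj = str_key((), {"t": Celsius(20)})

print(k_int)           # ((), "[('t', 20)]")
print(k_int == k_obj)  # True: two different inputs share one cache key
```

A memoizer using this key would return the cached result for the int when called with the Celsius object, or the other way round.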

Also consider the following: if you are storing keys like ((<map object at 0x1377d50>,), frozenset(...)) and ((<list_iterator object at 0x1377dd0>,<list_iterator object at 0x1377dd0>), frozenset(...)), your cache will grow without bound just from calling the function with equivalent items. (You could perhaps work around this issue with a regex...) And attempting to consume the generators to build a key will mess up the semantics of the function you're memoizing. This may be the desired behavior, though, if you wish to memoize on is-style equality rather than ==-style equality.
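The cache-growth problem is easy to reproduce. In the sketch below, memoized_sum and cache are illustrative names; iterators hash by identity, so two iterators over equal data never share a key.

```python
cache = {}

def memoized_sum(*args, **kwargs):
    """Sum the first argument, caching on (args, frozenset(kwargs))."""
    key = (args, frozenset(kwargs.items()))
    if key not in cache:
        cache[key] = sum(args[0])
    return cache[key]

# Two iterators over equal data are still two distinct objects, and the
# cache keeps both alive by holding them inside its keys.
memoized_sum(iter([1, 2, 3]))
memoized_sum(iter([1, 2, 3]))
print(len(cache))  # 2: one entry per call, no cache hit
```

Every call with a fresh iterator adds an entry that can never be hit again, which is the unbounded growth described above.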

Also, evaluating something like str({1:object()}) repeatedly in the interpreter can show an object at the same memory location each time! This is memory reuse at work: in CPython the temporary object is freed as soon as the expression finishes, so the next one can land at the same address. This would be disastrous, because if you happen to be hashing <some object at 0x???????> and later create an object of the same type at the same memory location (after the first one has been freed), you will get incorrect results from the memoized function. As mentioned, one possible but really hackish workaround is to detect such objects with a regex.
