Python 可哈希字典 [英] Python hashable dicts

查看:49
本文介绍了Python 可哈希字典的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

作为练习,主要是为了我自己的娱乐,我正在实现一个回溯 Packrat 解析器.这样做的灵感是我想更好地了解 hygenic 宏如何在类似 algol 的语言中工作(与您通常在其中找到的无语法 lisp 方言相对应).因此,通过输入的不同传递可能会看到不同的语法,因此缓存的解析结果无效,除非我还将语法的当前版本与缓存的解析结果一起存储.(编辑:使用键值集合的结果是它们应该是不可变的,但我不打算公开接口以允许更改它们,因此无论是可变集合还是不可变集合都很好)

问题是python dicts不能作为其他dicts的键出现.即使使用元组(无论如何我都会这样做)也无济于事.

<预><代码>>>>缓存 = {}>>>规则 = {"foo":"bar"}>>>缓存[(规则,baz")] =quux"回溯(最近一次调用最后一次):文件<stdin>",第 1 行,在 <module> 中类型错误:不可散列的类型:'dict'>>>

我想它必须一直是元组.现在 python 标准库提供了大约我需要的东西,collections.namedtuple 有一个非常不同的语法,但 可以用作一个键.从上面的会话继续:

<预><代码>>>>从集合导入namedtuple>>>Rule = namedtuple("Rule",rule.keys())>>>cache[(Rule(**rule), "baz")] = "quux">>>缓存{(Rule(foo='bar'), 'baz'): 'quux'}

好的.但是我必须为我想要使用的规则中的每个可能的键组合创建一个类,这还不错,因为每个解析规则都确切地知道它使用什么参数,因此可以同时定义该类作为解析规则的函数.

namedtuple 的另一个问题是它们是严格定位的.两个看起来应该不同的元组实际上可能是相同的:

<预><代码>>>>你 = namedtuple("foo",["bar","baz"])>>>me = namedtuple("foo",["bar","quux"])>>>你(bar=1,baz=2) == 我(bar=1,quux=2)真的>>>bob = namedtuple("foo",["baz","bar"])>>>你(bar=1,baz=2) == bob(bar=1,baz=2)错误的

tl'dr:我如何获得可以用作其他 dict 的键的 dict ?

对答案进行了一些修改,这是我正在使用的更完整的解决方案.请注意,这做了一些额外的工作,以使生成的 dicts 出于实际目的模糊地不可变.当然,通过调用 dict.__setitem__(instance, key, value) 来破解它仍然很容易,但我们都是成年人了.

class hashdict(dict):"""可散列的 dict 实现,适合用作进入其他字典.>>>h1 = hashdict({"apples": 1, "bananas":2})>>>h2 = hashdict({"香蕉": 3, "芒果": 5})>>>h1+h2hashdict(苹果=1,香蕉=3,芒果=5)>>>d1 = {}>>>d1[h1] = "沙拉">>>d1[h1]'沙拉'>>>d1[h2]回溯(最近一次调用最后一次):...KeyError: hashdict(bananas=3, mangoes=5)基于答案http://stackoverflow.com/questions/1151658/python-hashable-dicts"""def __key(self):返回元组(排序(self.items()))def __repr__(self):返回 "{0}({1})".format(self.__class__.__name__,", ".join("{0}={1}".format(str(i[0]),repr(i[1])) for i in self.__key()))def __hash__(self):返回哈希(self.__key())def __setitem__(self, key, value):raise TypeError("{0} 不支持项目分配".format(self.__class__.__name__))def __delitem__(self, key):raise TypeError("{0} 不支持项目分配".format(self.__class__.__name__))明确清除(自我):raise TypeError("{0} 不支持项目分配".format(self.__class__.__name__))def pop(self, *args, **kwargs):raise TypeError("{0} 不支持项目分配".format(self.__class__.__name__))def popitem(self, *args, **kwargs):raise TypeError("{0} 不支持项目分配".format(self.__class__.__name__))def setdefault(self, *args, **kwargs):raise TypeError("{0} 不支持项目分配".format(self.__class__.__name__))def 更新(self, *args, **kwargs):raise TypeError("{0} 不支持项目分配".format(self.__class__.__name__))# 更新是不行的,因为它改变了对象# __add__ 没问题,因为它创建了一个新对象# 当新对象正在构建中时,可以对其进行变异def __add__(self, right):结果 = hashdict(self)dict.update(结果,对)返回结果如果 __name__ == "__main__":导入文档测试doctest.testmod()

解决方案

这是制作可散列字典的简单方法.请记住不要在嵌入到另一个字典后出于明显的原因对它们进行变异.

class hashabledict(dict):def __hash__(self):返回哈希(元组(排序(self.items())))

As an exercise, and mostly for my own amusement, I'm implementing a backtracking packrat parser. The inspiration for this is i'd like to have a better idea about how hygenic macros would work in an algol-like language (as apposed to the syntax free lisp dialects you normally find them in). Because of this, different passes through the input might see different grammars, so cached parse results are invalid, unless I also store the current version of the grammar along with the cached parse results. (EDIT: a consequence of this use of key-value collections is that they should be immutable, but I don't intend to expose the interface to allow them to be changed, so either mutable or immutable collections are fine)

The problem is that python dicts cannot appear as keys to other dicts. Even using a tuple (as I'd be doing anyways) doesn't help.

>>> cache = {}
>>> rule = {"foo":"bar"}
>>> cache[(rule, "baz")] = "quux"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
>>> 

I guess it has to be tuples all the way down. Now the python standard library provides approximately what i'd need, collections.namedtuple has a very different syntax, but can be used as a key. continuing from above session:

>>> from collections import namedtuple
>>> Rule = namedtuple("Rule",rule.keys())
>>> cache[(Rule(**rule), "baz")] = "quux"
>>> cache
{(Rule(foo='bar'), 'baz'): 'quux'}

Ok. But I have to make a class for each possible combination of keys in the rule I would want to use, which isn't so bad, because each parse rule knows exactly what parameters it uses, so that class can be defined at the same time as the function that parses the rule.

Edit: An additional problem with namedtuples is that they are strictly positional. Two tuples that look like they should be different can in fact be the same:

>>> you = namedtuple("foo",["bar","baz"])
>>> me = namedtuple("foo",["bar","quux"])
>>> you(bar=1,baz=2) == me(bar=1,quux=2)
True
>>> bob = namedtuple("foo",["baz","bar"])
>>> you(bar=1,baz=2) == bob(bar=1,baz=2)
False

tl'dr: How do I get dicts that can be used as keys to other dicts?

Having hacked a bit on the answers, here's the more complete solution I'm using. Note that this does a bit extra work to make the resulting dicts vaguely immutable for practical purposes. Of course it's still quite easy to hack around it by calling dict.__setitem__(instance, key, value) but we're all adults here.

class hashdict(dict):
    """
    hashable dict implementation, suitable for use as a key into
    other dicts.

        >>> h1 = hashdict({"apples": 1, "bananas":2})
        >>> h2 = hashdict({"bananas": 3, "mangoes": 5})
        >>> h1+h2
        hashdict(apples=1, bananas=3, mangoes=5)
        >>> d1 = {}
        >>> d1[h1] = "salad"
        >>> d1[h1]
        'salad'
        >>> d1[h2]
        Traceback (most recent call last):
        ...
        KeyError: hashdict(bananas=3, mangoes=5)

    based on answers from
       http://stackoverflow.com/questions/1151658/python-hashable-dicts

    """
    def __key(self):
        return tuple(sorted(self.items()))
    def __repr__(self):
        return "{0}({1})".format(self.__class__.__name__,
            ", ".join("{0}={1}".format(
                    str(i[0]),repr(i[1])) for i in self.__key()))

    def __hash__(self):
        return hash(self.__key())
    def __setitem__(self, key, value):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def __delitem__(self, key):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def clear(self):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def pop(self, *args, **kwargs):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def popitem(self, *args, **kwargs):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def setdefault(self, *args, **kwargs):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def update(self, *args, **kwargs):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    # update is not ok because it mutates the object
    # __add__ is ok because it creates a new object
    # while the new object is under construction, it's ok to mutate it
    def __add__(self, right):
        result = hashdict(self)
        dict.update(result, right)
        return result

if __name__ == "__main__":
    import doctest
    doctest.testmod()

解决方案

Here is the easy way to make a hashable dictionary. Just remember not to mutate them after embedding in another dictionary for obvious reasons.

class hashabledict(dict):
    def __hash__(self):
        return hash(tuple(sorted(self.items())))

这篇关于Python 可哈希字典的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆