Exposing `defaultdict` as a regular `dict`


Problem description

I am using defaultdict(set) to populate an internal mapping in a very large data structure. After it's populated, the whole structure (including the mapping) is exposed to the client code. At that point, I don't want anyone modifying the mapping.

And nobody does, intentionally. But sometimes client code may accidentally refer to an element that doesn't exist. At that point, a normal dictionary would raise KeyError, but since the mapping is a defaultdict, it simply creates a new element (an empty set) at that key. This is quite hard to catch, since everything happens silently. But I need to ensure this doesn't happen (the semantics don't actually break, but the mapping grows to a huge size).
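To make the silent growth concrete, here is a minimal sketch. The name `mapping` and the key `"missing"` are placeholders for illustration, not taken from the real data structure.

```python
from collections import defaultdict

# Hypothetical stand-in for the internal mapping described above.
mapping = defaultdict(set)
mapping["a"].add(1)

size_before = len(mapping)

# Reading a missing key does not raise; it inserts an empty set instead.
value = mapping["missing"]

size_after = len(mapping)
print(value, size_before, size_after)
```

The lookup alone adds an entry, so every mistaken read in a tight loop leaves one more key behind.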

What should I do? I can see these choices:

  1. Find all the instances in current and future client code where a dictionary lookup is performed on the mapping, and convert it to mapping.get(k, {}) instead. This is just terrible.

  2. "Freeze" defaultdict after the data structure is fully initialized, by converting it to dict. (I know it's not really frozen, but I trust client code to not actually write mapping[k] = v.) Inelegant, and a large performance hit.

  3. Wrap defaultdict into a dict interface. What's an elegant way to do that? I'm afraid the performance hit may be huge though (this lookup is heavily used in tight loops).

  4. Subclass defaultdict and add a method that "shuts down" all the defaultdict features, leaving it to behave as if it's a regular dict. It's a variant of 3 above, but I'm not sure if it's any faster. And I don't know if it's doable without relying on the implementation details.

  5. Use regular dict in the data structure, rewriting all the code there to first check if the element is in the dictionary and adding it if it's not. Not good.

Solution

The defaultdict docs say this about default_factory:

If the default_factory attribute is None, this raises a KeyError exception with the key as argument.

What if you just set your defaultdict's default_factory to None? E.g.,

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> d['a'] += 1
>>> d
defaultdict(<class 'int'>, {'a': 1})
>>> d.default_factory = None
>>> d['b'] += 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'b'

I'm not sure this is the best approach, but it seems to work.
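Applied to the defaultdict(set) case from the question, the same idea might look like the sketch below. The helper name `freeze` and the variable `index` are assumptions made for illustration; the only library behavior relied on is the documented `default_factory` attribute.

```python
from collections import defaultdict

def freeze(mapping):
    """Hypothetical helper: stop a populated defaultdict from creating defaults."""
    mapping.default_factory = None
    return mapping

# Populate while the default factory is still active.
index = defaultdict(set)
index["a"].add(1)
index["b"].add(2)

# Hand the structure to client code only after this point.
freeze(index)

try:
    index["missing"]
    raised = False
except KeyError:
    # With default_factory set to None, a missing key raises like a plain dict.
    raised = True

print(raised, len(index))
```

This keeps the same object, so nothing is copied and existing keys are looked up as before. It is not a true freeze: explicit assignments such as `index[k] = v` still succeed.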
