使用 Pickle/cPickle 达到最大递归深度 [英] Hitting Maximum Recursion Depth Using Pickle / cPickle

查看:21
本文介绍了使用 Pickle/cPickle 达到最大递归深度的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

背景:我正在使用最小构造算法构建一个代表字典的尝试.输入列表为 4.3M utf-8 字符串,按字典顺序排序.结果图是无环的,最大深度为 638 个节点.我的脚本的第一行通过 sys.setrecursionlimit() 将递归限制设置为 1100.

The background: I'm building a trie to represent a dictionary, using a minimal construction algorithm. The input list is 4.3M utf-8 strings, sorted lexicographically. The resulting graph is acyclic and has a maximum depth of 638 nodes. The first line of my script sets the recursion limit to 1100 via sys.setrecursionlimit().

问题:我希望能够将我的尝试序列化到磁盘,这样我就可以将它加载到内存中而无需从头开始重建(大约 22 分钟).我已经尝试了 pickle.dump()cPickle.dump() 两种文本和二进制协议.每次,我都会得到如下所示的堆栈跟踪:

The problem: I'd like to be able to serialize my trie to disk, so I can load it into memory without having to rebuild from scratch (roughly 22 minutes). I have tried both pickle.dump() and cPickle.dump(), with both the text and binary protocols. Each time, I get a stack-trace that looks like the following:

  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 663, in _batch_setitems
    save(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 725, in save_inst
    save(stuff)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 648, in save_dict
    self.memoize(obj)
RuntimeError: maximum recursion depth exceeded

我的数据结构相对简单:trie 包含对开始状态的引用,并定义了一些方法.dfa_state 包含一个布尔字段、一个字符串字段和一个从标签到状态的字典映射.

My data structures are relatively simple: trie contains a reference to a start state, and defines some methods. dfa_state contains a boolean field, a string field, and a dictionary mapping from label to state.

我对 pickle 的内部工作不是很熟悉 - 对于某些 n,我的最大递归深度是否需要大于/等于 trie 的深度的 n 倍?或者这可能是由我不知道的其他原因引起的?

I'm not very familiar with the inner workings of pickle - does my max recursion depth need to be greater/equal n times the depth of the trie for some n? Or could this be caused by something else I'm unaware of?

更新:将递归深度设置为 3000 无济于事,因此这条途径看起来并不乐观.

Update: Setting the recursion depth to 3000 didn't help, so this avenue doesn't look promising.

更新 2:你们是对的;由于默认递归限制,我认为pickle会使用小的嵌套深度是短视的.10,000 成功了.

Update 2: You guys were right; I was being short-sighted in assuming that pickle would use a small nesting depth due to default recursion limitations. 10,000 did the trick.

推荐答案

来自 文档:

尝试pickle一个高度递归的数据结构可能会超过最大递归深度,在这种情况下会引发一个RuntimeError.您可以使用 sys.setrecursionlimit() 小心地提高此限制.

Trying to pickle a highly recursive data structure may exceed the maximum recursion depth, a RuntimeError will be raised in this case. You can carefully raise this limit with sys.setrecursionlimit().

尽管您的 trie 实现可能很简单,但它使用递归并且在转换为持久数据结构时可能会导致问题.

Although your trie implementation may be simple, it uses recursion and can lead to issues when converting to a persistent data structure.

我的建议是继续提高递归限制,以查看您正在使用的数据和您正在使用的 trie 实现是否存在上限.

My recommendation would be continue raising the recursion limit to see if there is an upper bound for the data you are working with and the trie implementation you are using.

除此之外,如果可能,您可以尝试将树实现更改为较少递归",或者编写附加实现,内置数据持久性(使用pickles和shelves 在您的实现中).希望有帮助

Other then that, you can try changing your tree implementation to be "less recursive", if possible, or write an additional implementation that has data persistence built-in (use pickles and shelves in your implementation). Hope that helps

这篇关于使用 Pickle/cPickle 达到最大递归深度的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆