使用Pickle/cPickle达到最大递归深度 [英] Hitting Maximum Recursion Depth Using Pickle / cPickle

查看:150
本文介绍了使用Pickle/cPickle达到最大递归深度的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

背景:我正在使用最小的构建算法构建一个Trie以表示字典.输入列表是430万个utf-8字符串,按字典顺序排序.结果图是非循环的,最大深度为638个节点.我的脚本的第一行通过sys.setrecursionlimit()将递归限制设置为1100.

The background: I'm building a trie to represent a dictionary, using a minimal construction algorithm. The input list is 4.3M utf-8 strings, sorted lexicographically. The resulting graph is acyclic and has a maximum depth of 638 nodes. The first line of my script sets the recursion limit to 1100 via sys.setrecursionlimit().

问题:我希望能够将Trie序列化到磁盘上,因此我可以将其加载到内存中而不必从头开始重建(大约22分钟).我已经尝试了pickle.dump()cPickle.dump()以及文本和二进制协议.每次,我都会得到如下所示的堆栈跟踪:

The problem: I'd like to be able to serialize my trie to disk, so I can load it into memory without having to rebuild from scratch (roughly 22 minutes). I have tried both pickle.dump() and cPickle.dump(), with both the text and binary protocols. Each time, I get a stack-trace that looks like the following:

  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 663, in _batch_setitems
    save(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 725, in save_inst
    save(stuff)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 648, in save_dict
    self.memoize(obj)
RuntimeError: maximum recursion depth exceeded

我的数据结构相对简单:trie包含对开始状态的引用,并定义了一些方法. dfa_state包含一个布尔字段,一个字符串字段以及一个从标签到状态的字典映射.

My data structures are relatively simple: trie contains a reference to a start state, and defines some methods. dfa_state contains a boolean field, a string field, and a dictionary mapping from label to state.

我对pickle的内部运作不是很熟悉-我的最大递归深度是否需要大于/等于Trie深度的n倍(大约n倍)?还是这可能是由我不知道的其他原因引起的?

I'm not very familiar with the inner workings of pickle - does my max recursion depth need to be greater/equal n times the depth of the trie for some n? Or could this be caused by something else I'm unaware of?

更新:将递归深度设置为3000并没有帮助,因此这种途径看起来前景不佳.

Update: Setting the recursion depth to 3000 didn't help, so this avenue doesn't look promising.

更新2:你们是对的;假设由于默认递归限制,泡菜将使用较小的嵌套深度,这是我的短视. 10,000个成功.

Update 2: You guys were right; I was being short-sighted in assuming that pickle would use a small nesting depth due to default recursion limitations. 10,000 did the trick.

推荐答案

来自

尝试腌制高度递归的数据结构可能会超出最大递归深度,在这种情况下将引发RuntimeError.您可以使用sys.setrecursionlimit()小心地提高此限制.

Trying to pickle a highly recursive data structure may exceed the maximum recursion depth, a RuntimeError will be raised in this case. You can carefully raise this limit with sys.setrecursionlimit().

尽管您的trie实现可能很简单,但它使用递归,并且在转换为持久性数据结构时可能会导致问题.

Although your trie implementation may be simple, it uses recursion and can lead to issues when converting to a persistent data structure.

我的建议是继续提高递归限制,以查看您正在使用的数据和所使用的trie实现是否有上限.

My recommendation would be continue raising the recursion limit to see if there is an upper bound for the data you are working with and the trie implementation you are using.

除此之外,如果可能,您可以尝试将树实现更改为递归较少",或编写具有内置数据持久性的其他实现(使用pickle和货架).希望有帮助

Other then that, you can try changing your tree implementation to be "less recursive", if possible, or write an additional implementation that has data persistence built-in (use pickles and shelves in your implementation). Hope that helps

这篇关于使用Pickle/cPickle达到最大递归深度的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆