Python Disk-Based Dictionary

Question

I was running some dynamic programming code (trying to brute-force disprove the Collatz conjecture =P) and I was using a dict to store the lengths of the chains I had already computed. Obviously, it ran out of memory at some point. Is there any easy way to use some variant of a dict which will page parts of itself out to disk when it runs out of room? Obviously it will be slower than an in-memory dict, and it will probably end up eating my hard drive space, but this could apply to other problems that are not so futile.

I realized that a disk-based dictionary is pretty much a database, so I manually implemented one using sqlite3, but I didn't do it in any smart way and had it look up every element in the DB one at a time... it was about 300x slower.
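For illustration only, here is a minimal sketch of one way to avoid touching the database for every element: keep recent entries in an ordinary dict and write them to SQLite in batches, inside a single transaction. This is not the implementation described above (which is not shown); the class name `SpillDict`, the table name `kv`, and the `max_items` threshold are all hypothetical, and the sketch assumes keys and values fit in SQLite's 64-bit signed integers.

```python
import sqlite3


class SpillDict:
    """Hypothetical int -> int store: recent entries in RAM, the rest in SQLite."""

    def __init__(self, path=":memory:", max_items=100_000):
        self._hot = {}          # entries not yet written to disk
        self._max = max_items   # flush threshold (assumed value, tune as needed)
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS kv (k INTEGER PRIMARY KEY, v INTEGER)"
        )

    def __setitem__(self, key, value):
        self._hot[key] = value
        if len(self._hot) >= self._max:
            self.flush()

    def get(self, key, default=None):
        # Check RAM first; fall back to an indexed primary-key lookup on disk.
        if key in self._hot:
            return self._hot[key]
        row = self._db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return default if row is None else row[0]

    def flush(self):
        # One transaction and one executemany call for the whole batch,
        # instead of one statement and one commit per element.
        with self._db:
            self._db.executemany(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)",
                self._hot.items(),
            )
        self._hot.clear()
```

Reads that miss the in-memory part still cost one query each, so whether this helps depends on the access pattern; it mainly removes the per-element write overhead.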

Is the smartest way to just create my own set of dicts, keeping only one in memory at a time, and paging them out in some efficient manner?

Recommended answer

Hash-on-disk is generally addressed with Berkeley DB or something similar - several options are listed in the Python Data Persistence documentation. You can front it with an in-memory cache, but I'd test against native performance first; with operating system caching in place it might come out about the same.
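As a rough sketch of that approach, the standard-library `dbm` module (one of the options in the persistence documentation) gives a hash-on-disk store with a dict-like interface. It picks whichever backend is available on the system, which may or may not be a Berkeley DB variant. The `DiskDict` wrapper and `chain_length` function below are hypothetical names introduced for illustration; `dbm` stores bytes, so the wrapper converts integers to and from their decimal text form.

```python
import dbm
import os
import tempfile


class DiskDict:
    """Hypothetical int -> int mapping stored in a dbm file."""

    def __init__(self, path):
        self._db = dbm.open(path, "c")  # "c": create the file if it is missing

    def __contains__(self, key):
        return str(key).encode() in self._db

    def __getitem__(self, key):
        return int(self._db[str(key).encode()])

    def __setitem__(self, key, value):
        self._db[str(key).encode()] = str(value).encode()

    def close(self):
        self._db.close()


def chain_length(n, cache):
    """Number of terms in the Collatz chain starting at n, memoised in cache."""
    path = []
    while n != 1 and n not in cache:
        path.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    length = 1 if n == 1 else cache[n]
    # Walk back up the chain, recording the length for every value visited.
    for m in reversed(path):
        length += 1
        cache[m] = length
    return length


with tempfile.TemporaryDirectory() as tmp:
    cache = DiskDict(os.path.join(tmp, "collatz"))
    print(chain_length(27, cache))
    cache.close()
```

No in-memory cache is layered on top here, in line with the advice to measure native performance first; `chain_length` only needs `in`, `[]` lookup, and `[]` assignment, so a plain `dict` can be swapped in for comparison.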
