Fastest way to save and load a large dictionary in Python


Question

I have a relatively large dictionary. How do I know it is large? When I save it using cPickle, the file grows to approximately 400 MB. cPickle is supposed to be much faster than pickle, but loading and saving this file still takes a long time. I have a dual-core 2.6 GHz laptop with 4 GB of RAM running Linux. Does anyone have any suggestions for saving and loading dictionaries faster in Python? Thanks.

Solution

Use the protocol=2 option of cPickle. The default protocol (0) is much slower, and produces much larger files on disk.
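As a rough illustration of the protocol setting, here is a minimal sketch. It assumes Python 3, where cPickle no longer exists as a separate module and the plain pickle module uses the C implementation automatically; on Python 2 you would `import cPickle as pickle` instead. The helper names `save_dict` and `load_dict`, the file names, and the sample data are made up for the example.

```python
import os
import pickle  # Python 3; on Python 2 use `import cPickle as pickle`
import tempfile


def save_dict(data, path, protocol=2):
    """Pickle `data` to `path` with the given protocol (binary mode is required)."""
    with open(path, "wb") as f:
        pickle.dump(data, f, protocol=protocol)


def load_dict(path):
    """Load a pickled object back from `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical sample data standing in for the large dictionary.
sample = {"key%d" % i: [i, i / 7.0] for i in range(10000)}

with tempfile.TemporaryDirectory() as tmpdir:
    path_p0 = os.path.join(tmpdir, "sample_p0.pkl")
    path_p2 = os.path.join(tmpdir, "sample_p2.pkl")

    save_dict(sample, path_p0, protocol=0)  # text-based protocol
    save_dict(sample, path_p2, protocol=2)  # binary protocol

    size_p0 = os.path.getsize(path_p0)
    size_p2 = os.path.getsize(path_p2)
    restored = load_dict(path_p2)

print("protocol 0: %d bytes, protocol 2: %d bytes" % (size_p0, size_p2))
```

On Python 3 the default protocol is already a binary one, so passing `protocol=pickle.HIGHEST_PROTOCOL` is a reasonable alternative if the file never has to be read by an older interpreter.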

If you just want to work with a larger dictionary than memory can hold, the shelve module is a good quick-and-dirty solution. It acts like an in-memory dict, but stores itself on disk rather than in memory. shelve is based on cPickle, so be sure to set your protocol to anything other than 0.
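A minimal sketch of the shelve approach, assuming Python 3 (where `shelve.open` can be used as a context manager). The path `big_dict` and the stored values are placeholders; note that shelve keys must be strings.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    db_path = os.path.join(tmpdir, "big_dict")  # hypothetical file name

    # protocol=2 selects a binary pickle format for the stored values.
    with shelve.open(db_path, protocol=2) as db:
        db["alpha"] = {"count": 1, "tags": ["a", "b"]}
        db["beta"] = list(range(5))

    # Reopen the shelf; only the values actually accessed are unpickled.
    with shelve.open(db_path, protocol=2) as db:
        alpha = db["alpha"]
        keys = sorted(db.keys())

print(keys, alpha)
```

Each value is pickled and written on assignment, so mutating a stored object in place is not persisted unless you reassign it or open the shelf with `writeback=True`.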

The advantages of a database like sqlite over cPickle will depend on your use case. How often will you write data? How many times do you expect to read each datum that you write? Will you ever want to perform a search of the data you write, or load it one piece at a time?

If you're doing write-once, read-many, and loading one piece at a time, by all means use a database. If you're doing write-once, read-once, cPickle (with any protocol other than the default protocol=0) will be hard to beat. If you just want a large, persistent dict, use shelve.
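For the write-once, read-many case, one possible shape for the database option is a single key/value table in sqlite, with each value pickled into a BLOB. This is only a sketch under assumptions: the table name `kv` and the helpers `put_many` and `get` are invented for the example, and an in-memory database is used so the snippet runs without touching disk.

```python
import pickle
import sqlite3

# ":memory:" keeps the example self-contained; pass a file path for persistence.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value BLOB)")


def put_many(conn, mapping):
    """Store every item of `mapping` in one transaction."""
    rows = ((k, pickle.dumps(v, protocol=2)) for k, v in mapping.items())
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", rows
        )


def get(conn, key, default=None):
    """Load a single value by key without reading the rest of the table."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return default if row is None else pickle.loads(row[0])


put_many(conn, {"alpha": [1, 2, 3], "beta": {"x": 1.5}})
beta = get(conn, "beta")
missing = get(conn, "gamma")
conn.close()

print(beta, missing)
```

Writing all rows inside one transaction matters for speed, since committing after every insert is typically much slower.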
