Saving dictionaries to file (numpy and Python 2/3 friendly)

Question

I want to do hierarchical key-value storage in Python, which basically boils down to storing dictionaries to files. By that I mean any type of dictionary structure, that may contain other dictionaries, numpy arrays, serializable Python objects, and so forth. Not only that, I want it to store numpy arrays space-optimized and play nice between Python 2 and 3.

Below are methods I know are out there. My question is what is missing from this list and is there an alternative that dodges all my deal-breakers?

  • Python's pickle module (deal-breaker: inflates the size of numpy arrays a lot)
  • Numpy's save/savez/load (deal-breaker: Incompatible format across Python 2/3)
  • PyTables replacement for numpy.savez (deal-breaker: only handles numpy arrays)
  • Using PyTables manually (deal-breaker: I want this for constantly changing research code, so it's really convenient to be able to dump dictionaries to files by calling a single function)

The PyTables replacement of numpy.savez is promising, since I like the idea of using hdf5 and it compresses the numpy arrays really efficiently, which is a big plus. However, it does not take any type of dictionary structure.

Lately, what I've been doing is to use something similar to the PyTables replacement, but enhancing it to be able to store any type of entries. This actually works pretty well, but I find myself storing primitive data types in length-1 CArrays, which is a bit awkward (and ambiguous to actual length-1 arrays), even though I set chunksize to 1 so it doesn't take up that much space.
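
The approach described above can be sketched roughly as follows. This is not the asker's code. It is a minimal illustration that uses h5py rather than PyTables, and the function names `save_dict` and `load_dict` are hypothetical. It assumes all dictionary keys are strings and that leaf values are numpy arrays, numbers, or strings. Scalars are written as HDF5 attributes instead of length-1 CArrays, which is one possible way to keep them distinct from real length-1 arrays.

```python
import os
import tempfile

import h5py
import numpy as np


def save_dict(group, data):
    """Recursively write a nested dict into an open h5py group (hypothetical helper)."""
    for key, value in data.items():
        if isinstance(value, dict):
            # Nested dictionaries become HDF5 sub-groups.
            save_dict(group.create_group(key), value)
        elif isinstance(value, np.ndarray):
            # Arrays become datasets; gzip needs a non-scalar, non-empty shape.
            use_gzip = value.ndim > 0 and value.size > 0
            group.create_dataset(
                key, data=value, compression="gzip" if use_gzip else None
            )
        else:
            # Assumption: anything else is a number or string, stored as an attribute.
            group.attrs[key] = value


def load_dict(group):
    """Rebuild the nested dict written by save_dict."""
    out = {}
    for key, value in group.attrs.items():
        out[key] = value
    for key, item in group.items():
        if isinstance(item, h5py.Group):
            out[key] = load_dict(item)
        else:
            out[key] = item[()]  # read the whole dataset into memory
    return out


data = {
    "name": "run1",
    "steps": 1,                      # a scalar
    "weights": np.array([1.0]),      # a genuine length-1 array
    "meta": {"lr": 0.5, "grid": np.zeros((100, 100))},
}

path = os.path.join(tempfile.mkdtemp(), "example.h5")
with h5py.File(path, "w") as f:
    save_dict(f, data)
with h5py.File(path, "r") as f:
    loaded = load_dict(f)
```

After the round trip, `loaded["steps"]` comes back as a numpy scalar while `loaded["weights"]` is still an array of shape `(1,)`, so the two cases stay distinguishable. Arbitrary Python objects are not handled here; a fuller version would need a fallback for them.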

Is there something like that already out there?

Thanks!

Recommended answer

After asking this two years ago, I started coding my own HDF5-based replacement for pickle/np.save. Since then it has matured into a stable package, so I thought I would finally answer and accept my own question, because it is by design exactly what I was looking for:
