Pickle alternatives

Question

I am trying to serialize a large (~10**6 rows, each with ~20 values) list, to be used later by myself (so pickle's lack of safety isn't a concern).

Each row of the list is a tuple of values, derived from some SQL database. So far, I have seen datetime.datetime, strings, integers, and NoneType, but I might eventually have to support additional data types.

For serialization, I've considered pickle (cPickle), json, and plain text - but only pickle saves the type information: json can't serialize datetime.datetime, and plain text has its obvious disadvantages.
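To make the type-information point concrete, here is a minimal sketch in Python 3 (where cPickle has been folded into the standard `pickle` module). The sample row is made up for illustration; it only mirrors the value types mentioned above.

```python
import json
import pickle
from datetime import datetime

# Hypothetical row with the value types seen so far: int, str, datetime, None.
row = (1, "alice", datetime(2020, 1, 15, 12, 0, 0), None)

# json has no encoding for datetime objects, so dumps() raises TypeError.
try:
    json.dumps(row)
    json_ok = True
except TypeError:
    json_ok = False

# pickle round-trips the tuple with every element's type intact.
restored = pickle.loads(pickle.dumps(row, protocol=pickle.HIGHEST_PROTOCOL))

print(json_ok)
print(restored == row, type(restored[2]).__name__)
```

json can be made to accept datetime values through a custom `default=` hook, but the reader then has to know which strings to convert back, which is the type information that is lost.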

However, cPickle is pretty slow for data this large, and I'm looking for a faster alternative.

Recommended answer

I think you should give PyTables a look. It should be ridiculously fast, at least faster than using an RDBMS, since it's very lax and doesn't impose any read/write restrictions, plus you get a better interface for managing your data, at least compared to pickling it.
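A rough sketch of what storing such rows in PyTables might look like, under several assumptions that are not in the answer: the three-column schema, the `save_rows`/`load_rows` helper names, and the choice to store datetimes as POSIX timestamps are all hypothetical. PyTables columns are fixed-type, so `None` values would need a sentinel or a separate flag column, which this sketch does not handle.

```python
import os
import tempfile
from datetime import datetime

import tables  # PyTables, a third-party package (pip install tables)


class Row(tables.IsDescription):
    # Hypothetical schema; the real data would have about 20 columns.
    id = tables.Int64Col(pos=0)
    name = tables.StringCol(32, pos=1)   # fixed-width bytes, 32 at most
    created = tables.Float64Col(pos=2)   # datetime kept as a POSIX timestamp


def save_rows(path, rows):
    """Write (id, name, created) tuples to a new HDF5 file."""
    with tables.open_file(path, mode="w") as h5:
        table = h5.create_table("/", "rows", Row)
        record = table.row
        for id_, name, created in rows:
            record["id"] = id_
            record["name"] = name.encode("utf-8")
            record["created"] = created.timestamp()
            record.append()
        table.flush()


def load_rows(path):
    """Read the table back and rebuild the original Python types."""
    with tables.open_file(path, mode="r") as h5:
        return [
            (
                int(rec["id"]),
                rec["name"].decode("utf-8"),
                datetime.fromtimestamp(float(rec["created"])),
            )
            for rec in h5.root.rows.read()
        ]


rows = [
    (1, "alice", datetime(2020, 1, 15, 12, 0, 0)),
    (2, "bob", datetime(2021, 6, 1, 8, 30, 0)),
]
path = os.path.join(tempfile.mkdtemp(), "rows.h5")
save_rows(path, rows)
restored = load_rows(path)
```

The trade-off compared with pickle is that the schema has to be declared up front, so supporting additional SQL data types later means extending the `Row` description and the conversion code.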
