How to load Pickle file in chunks?


Question

Is there any option to load a pickle file in chunks?

I know we can save the data in CSV and load it in chunks. But other than CSV, is there any option to load a pickle file, or any Python-native file format, in chunks?
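For context, pickle does support this directly: each `dump()` call writes one complete, self-delimiting pickle frame, so you can read a file back one object at a time. A minimal sketch (the filename `chunked.pkl` is just an illustration):

```python
import pickle

# Write several objects ("chunks") to one file with repeated dump() calls.
chunks = [list(range(i * 3, i * 3 + 3)) for i in range(4)]

with open("chunked.pkl", "wb") as f:
    for chunk in chunks:
        pickle.dump(chunk, f)  # each dump() appends one complete pickle frame

# Read the chunks back one at a time, without loading the whole file at once.
loaded = []
with open("chunked.pkl", "rb") as f:
    while True:
        try:
            loaded.append(pickle.load(f))  # reads exactly one chunk per call
        except EOFError:
            break  # end of file: all chunks consumed

print(loaded)
```

Each `load()` consumes exactly one `dump()`'s worth of data, so memory usage is bounded by the size of a single chunk rather than the whole file.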

Answer

I had a similar issue: I wrote a barrel file-descriptor pool and noticed that my pickle files were getting corrupted when I closed a file descriptor. Although you can do multiple dump() operations on an open file descriptor, it's not possible to simply do an open('file', 'ab') afterwards to start saving a new set of objects.

I got around this by doing a pickler.dump(None) as a session terminator right before I had to close the file descriptor; upon re-opening, I instantiated a new Pickler instance to resume writing to the file.
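The writing side of that approach can be sketched as follows. This is a minimal interpretation of the answer, not the author's actual code; the filename `sessions.pkl` and the `write_session` helper are illustrative, and note the sentinel scheme assumes None is never a legitimate payload object:

```python
import pickle

def write_session(path, objects, mode):
    """Write one session of objects, terminated by a None sentinel."""
    with open(path, mode) as f:
        pickler = pickle.Pickler(f)  # fresh Pickler per session
        for obj in objects:
            pickler.dump(obj)
        pickler.dump(None)  # session terminator, written before closing

write_session("sessions.pkl", ["a", "b"], "wb")  # first session
write_session("sessions.pkl", ["c"], "ab")       # later session, appended
```

Because every session ends with its own None marker, a reader can always tell where one write session stopped and the next began.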

When loading from this file, a None object signifies end-of-session, at which point I instantiate a new Unpickler instance with the file descriptor to continue reading the remainder of the multi-session pickle file.
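The reading side can be sketched like this, again as an interpretation of the answer rather than its original code (note that loading is done by pickle.Unpickler; the self-test at the bottom builds a small two-session file with the illustrative name `multi.pkl`):

```python
import pickle

def read_sessions(path):
    """Read a multi-session pickle file; None terminates each session."""
    sessions = []
    with open(path, "rb") as f:
        while True:
            unpickler = pickle.Unpickler(f)  # fresh Unpickler per session
            current = []
            while True:
                try:
                    obj = unpickler.load()
                except EOFError:
                    return sessions  # true end of file: no more sessions
                if obj is None:
                    break  # session terminator written by the pickler
                current.append(obj)
            sessions.append(current)

# Build a two-session file the way the answer describes, then read it back.
with open("multi.pkl", "wb") as f:
    for obj in ("a", "b", None):  # session 1 plus terminator
        pickle.dump(obj, f)
with open("multi.pkl", "ab") as f:
    for obj in ("c", None):  # session 2 plus terminator
        pickle.dump(obj, f)

print(read_sessions("multi.pkl"))
```

This keeps each session's objects grouped, while still reading the file one object at a time.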

This only applies if, for some reason, you have to close the file descriptor. Otherwise, any number of dump() calls can be performed on the open file and matched by load() calls later.
