cPickle.load() in python consumes a large memory
Question
I have a large dictionary whose structure looks like:
dcPaths = {'id_jola_001': CPath instance}
where CPath is a self-defined class:
class CPath(object):
    def __init__(self):
        # some attributes
        self.m_dAvgSpeed = 0.0
        ...
        # a list of CNode instances
        self.m_lsNodes = []
where m_lsNodes is a list of CNode:
class CNode(object):
    def __init__(self):
        # some attributes
        self.m_nLoc = 0
        # a list of Apps
        self.m_lsApps = []
Here, m_lsApps is a list of CApp, which is another self-defined class:
class CApp(object):
    def __init__(self):
        # some attributes
        self.m_nCount = 0
        self.m_nUpPackets = 0
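Putting the three classes together, the nested structure can be built like this (a minimal sketch; the single path, node, and app, and their zero-valued attributes, are placeholders for illustration):

```python
class CApp(object):
    def __init__(self):
        self.m_nCount = 0
        self.m_nUpPackets = 0

class CNode(object):
    def __init__(self):
        self.m_nLoc = 0
        self.m_lsApps = []  # list of CApp instances

class CPath(object):
    def __init__(self):
        self.m_dAvgSpeed = 0.0
        self.m_lsNodes = []  # list of CNode instances

# Build one path containing one node, which holds one app
path = CPath()
node = CNode()
node.m_lsApps.append(CApp())
path.m_lsNodes.append(node)
dcPaths = {'id_jola_001': path}
```

With millions of such small instances, the per-object overhead of ordinary class instances (each carrying its own `__dict__`) dominates memory use, which is why the in-memory size can dwarf the pickle file.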
I serialize this dictionary by using cPickle:
def serialize2File(strFileName, strOutDir, obj):
    if len(obj) != 0:
        strOutFilePath = "%s%s" % (strOutDir, strFileName)
        with open(strOutFilePath, 'w') as hOutFile:
            cPickle.dump(obj, hOutFile, protocol=0)
        return strOutFilePath
    else:
        print("Nothing to serialize!")
It works fine, and the serialized file is about 6.8GB. However, when I try to deserialize this object:
def deserializeFromFile(strFilePath):
    obj = 0
    with open(strFilePath) as hFile:
        obj = cPickle.load(hFile)
    return obj
I find it consumes more than 90GB of memory and takes a long time.
- Why does this happen?
- Is there any way to optimize it?
BTW, I'm using python 2.7.6
Answer
You can try specifying the pickle protocol; the fastest is -1 (meaning: the latest protocol, which is no problem if you are pickling and unpickling with the same Python version).
cPickle.dump(obj, file, protocol=-1)
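As a sketch, the question's helper can be switched to protocol -1; note the file must then be opened in binary mode, since protocols above 0 produce binary data. The `tempfile` directory and the `'test.pkl'` name here are illustrative, and the `try`/`except` import lets the snippet run on either Python 2 (`cPickle`) or Python 3 (`pickle`):

```python
try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3

import os
import tempfile

def serialize2File(strFileName, strOutDir, obj):
    # Same shape as the question's helper, but using protocol -1
    # and binary mode ('wb'), which protocol -1 requires.
    if len(obj) != 0:
        strOutFilePath = os.path.join(strOutDir, strFileName)
        with open(strOutFilePath, 'wb') as hOutFile:
            pickle.dump(obj, hOutFile, protocol=-1)
        return strOutFilePath
    else:
        print("Nothing to serialize!")

outDir = tempfile.mkdtemp()
path = serialize2File('test.pkl', outDir, {'a': [1, 2, 3]})
```

Using `os.path.join` instead of `"%s%s"` string concatenation also avoids silently wrong paths when the output directory lacks a trailing separator.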
EDIT: As said in the comments: load detects the protocol itself.
obj = cPickle.load(file)
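A minimal round trip confirming this (no protocol argument is passed to `load`; it reads the protocol from the stream — the dictionary contents are placeholders):

```python
try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3

import io

buf = io.BytesIO()
pickle.dump({'id_jola_001': [1.0, 2.0]}, buf, protocol=-1)
buf.seek(0)
obj = pickle.load(buf)  # protocol is detected automatically
```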