cPickle.load() in Python consumes a large amount of memory

Question

I have a large dictionary whose structure looks like:

dcPaths = {'id_jola_001': CPath instance}

where CPath is a self-defined class:

class CPath(object):
    def __init__(self):
        # some attributes
        self.m_dAvgSpeed = 0.0
        ...
        # a list of CNode instance
        self.m_lsNodes = []

where m_lsNodes is a list of CNode:

class CNode(object):
    def __init__(self):
        # some attributes
        self.m_nLoc = 0

        # a list of Apps
        self.m_lsApps = []

Here, m_lsApps is a list of CApp, which is another self-defined class:

class CApp(object):
    def __init__(self):
        # some attributes
        self.m_nCount = 0
        self.m_nUpPackets = 0

I serialize this dictionary by using cPickle:

import cPickle

def serialize2File(strFileName, strOutDir, obj):
    if len(obj) != 0:
        strOutFilePath = "%s%s" % (strOutDir, strFileName)
        with open(strOutFilePath, 'w') as hOutFile:
            cPickle.dump(obj, hOutFile, protocol=0)
        return strOutFilePath
    else:
        print("Nothing to serialize!")

It works fine, and the serialized file is about 6.8GB. However, when I try to deserialize this object:

def deserializeFromFile(strFilePath):
    obj = 0
    with open(strFilePath) as hFile:
        obj = cPickle.load(hFile)
    return obj

I find that it consumes more than 90GB of memory and takes a long time.

  1. Why does this happen?
  2. Is there any way to optimize this?

BTW, I'm using Python 2.7.6.

Recommended answer

You can try specifying the pickle protocol; fastest is -1 (meaning: latest protocol, no problem if you are pickling and unpickling with the same Python version).

cPickle.dump(obj, file, protocol=-1)
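
As a rough illustration of what the protocol choice changes, the sketch below pickles the same data with protocol 0 (the ASCII protocol used in the question) and with protocol -1, then compares the output sizes. The sample data is hypothetical: plain dicts stand in for CApp instances, the helper name pickled_size is made up for this example, and the data is far smaller than the real 6.8GB dictionary. It only shows the size difference and does not measure memory use during loading.

```python
import io

try:
    import cPickle as pickle  # Python 2, as in the question
except ImportError:
    import pickle  # Python 3: the C implementation is used automatically


def pickled_size(obj, protocol):
    """Return the number of bytes pickle.dump writes for obj."""
    buf = io.BytesIO()
    pickle.dump(obj, buf, protocol)
    return len(buf.getvalue())


# Hypothetical stand-in data: each dict mimics the attributes of one CApp.
apps = [{"m_nCount": i, "m_nUpPackets": i * 2} for i in range(1000)]

text_size = pickled_size(apps, 0)     # ASCII protocol, as in the question
binary_size = pickled_size(apps, -1)  # highest protocol available
print("protocol 0: %d bytes, protocol -1: %d bytes" % (text_size, binary_size))
```

When writing to a real file with a binary protocol, open it in binary mode ('wb' for dumping, 'rb' for loading).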

EDIT: As said in the comments: load detects the protocol itself.

obj = cPickle.load(file)
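
A small sketch of that point: the same load call, with no protocol argument, reads data written with any protocol. The dictionary below is hypothetical stand-in data and round_trip is a helper name invented for this example.

```python
import io

try:
    import cPickle as pickle  # Python 2, as in the question
except ImportError:
    import pickle  # Python 3


# Hypothetical stand-in for one entry of dcPaths.
data = {"id_jola_001": [0.0, [0, 1, 2]]}


def round_trip(obj, protocol):
    """Pickle obj with the given protocol, then load it back."""
    buf = io.BytesIO()
    pickle.dump(obj, buf, protocol)
    buf.seek(0)
    # load() takes only the file object; it detects the protocol itself.
    return pickle.load(buf)


results = [round_trip(data, protocol) for protocol in (0, 1, 2, -1)]
print(all(result == data for result in results))
```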
