How to save a dictionary in pickle

Problem description

I'm trying to use pickle to save a dictionary in a file. The code that saves the dictionary runs without any problems, but when I try to load the dictionary back from the file in the Python shell, I get an EOFError:

>>> import pprint
>>> pkl_file = open('data.pkl', 'rb')
>>> data1 = pickle.load(pkl_file)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "/usr/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 880, in load_eof
    raise EOFError
EOFError

My code is below.

It counts the frequency of each word and records the date of the data (the date is the file name). It then saves the words as the keys of a dictionary, with a list of (date, freq) tuples as the value of each key. Now I want to use this dictionary as the input to another part of my work:

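import codecs
import os
import pickle

# Maps each word to a list of (date, frequency) tuples.
# This module-level definition is not shown in the original snippet.
wordDict = {}

def pathFilesList():
    # Collect the path of every file under the source directory.
    source='StemmedDataset'
    retList = []
    for r,d,f in os.walk(source):
        for files in f:
            retList.append(os.path.join(r, files))
    return retList

def parsing():
    fileList = pathFilesList()
    for f in fileList:
        print("Processing file: " + str(f))
        fileWordList = []
        fileWordSet = set()
        fw=codecs.open(f,'r', encoding='utf-8')
        fLines = fw.readlines()
        # One stemmed word per line: keep every occurrence and the distinct set.
        for line in fLines:
            sWord = line.strip()
            fileWordList.append(sWord)
            fileWordSet.add(sWord)
        for stemWord in fileWordSet:
            stemFreq = fileWordList.count(stemWord)
            # f[15:-4] drops the 15-character 'StemmedDataset' prefix (plus
            # separator) and the 4-character extension, leaving the date.
            if stemWord not in wordDict:
                wordDict[stemWord] = [(f[15:-4], stemFreq)]
            else:
                wordDict[stemWord].append((f[15:-4], stemFreq))
        fw.close()

if __name__ == "__main__":
    parsing()
    # Binary mode is required for pickle; the with-block closes the file.
    with open('data.pkl', 'wb') as output:
        pickle.dump(wordDict, output)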

What do you think the problem is?

Recommended answer

Since this is Python 2, you often have to be more explicit about the encoding your source code is written in. PEP 263 explains this in detail. My suggestion is that you try adding the following as the very first two lines of unpickle.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

# The rest of your code....
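
For illustration, here is a minimal sketch of what an unpickle.py with this header might look like. The helper name `load_word_dict` and the empty-file check are assumptions and are not part of the original question or answer. The check is included because `pickle.load` raises EOFError when the file does not contain a complete pickle, for example when it is empty; whether that is the cause in this particular case is not established here.

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import pickle


def load_word_dict(path):
    """Hypothetical helper: load a pickled dictionary from `path`."""
    # An empty file is one typical way to hit EOFError on load,
    # so report it with a clearer message instead.
    if os.path.getsize(path) == 0:
        raise ValueError("%s is empty; was the dump written and closed?" % path)
    # Pickle files must be opened in binary mode.
    with open(path, 'rb') as pkl_file:
        return pickle.load(pkl_file)
```

Under these assumptions, `data1 = load_word_dict('data.pkl')` would replace the two shell lines that open and load the file by hand.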

By the way, if you are going to work a lot with non-ASCII characters, it might be a good idea to use Python 3 instead.
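
As a rough sketch of what the save-and-load round trip could look like in Python 3, where strings are Unicode by default: the sample words, dates and temporary file location below are made up for illustration, and the dictionary mirrors the question's layout of one word mapped to a list of (date, frequency) tuples.

```python
import os
import pickle
import tempfile

# Made-up sample data in the question's layout: word -> [(date, frequency), ...]
word_dict = {
    'café': [('2013-01-01', 3)],
    'naïve': [('2013-01-01', 1), ('2013-01-02', 2)],
}

# Write to a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), 'data.pkl')

# The with-block closes the file, so the pickle is fully written
# before anything tries to read it back.
with open(path, 'wb') as output:
    pickle.dump(word_dict, output)

# Read the dictionary back, again in binary mode.
with open(path, 'rb') as pkl_file:
    restored = pickle.load(pkl_file)

print(restored == word_dict)
```

Both files are opened in binary mode ('wb' and 'rb'), and each is closed by its with-block before the next step runs.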
