Pickle incompatibility of numpy arrays between Python 2 and 3


Question

I am trying to load the MNIST dataset linked here in Python 3.2 using this program:

import pickle
import gzip
import numpy


with gzip.open('mnist.pkl.gz', 'rb') as f:
    l = list(pickle.load(f))
    print(l)

Unfortunately, it gives me the error:

Traceback (most recent call last):
   File "mnist.py", line 7, in <module>
     train_set, valid_set, test_set = pickle.load(f)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x90 in position 614: ordinal not in range(128)

I then tried to decode the pickled file in Python 2.7, and re-encode it. So, I ran this program in Python 2.7:

import pickle
import gzip
import numpy


with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f)

    # Printing out the three objects reveals that they are
    # all pairs containing numpy arrays.

    with gzip.open('mnistx.pkl.gz', 'wb') as g:
        pickle.dump(
            (train_set, valid_set, test_set),
            g,
            protocol=2)  # I also tried protocol 0.

It ran without error, so I reran this program in Python 3.2:

import pickle
import gzip
import numpy

# note the filename change
with gzip.open('mnistx.pkl.gz', 'rb') as f:
    l = list(pickle.load(f))
    print(l)

However, it gave me the same error as before. How do I get this to work?

This is a better approach for loading the MNIST dataset.

Recommended answer

This looks like an incompatibility between the two pickle implementations. The unpickler is trying to load a "binstring" object (a Python 2 str), which Python 3 assumes to be ASCII text, while in this case it holds binary data. Whether this is a bug in the Python 3 unpickler or a "misuse" of the pickler by numpy, I don't know.
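The failure can be reproduced without the MNIST file. The sketch below is an illustration rather than part of the original answer: it uses a hand-written protocol 2 pickle of a two-byte Python 2 str (the variable names are made up for this example) and shows how the `encoding` argument of `pickle.loads` changes the outcome in Python 3.

```python
import pickle

# Hand-written protocol 2 pickle of the Python 2 str '\x90\x01':
# PROTO 2, SHORT_BINSTRING (length 2), BINPUT 0, STOP.
py2_pickle = b'\x80\x02U\x02\x90\x01q\x00.'

# With the default encoding ('ASCII'), the non-ASCII byte 0x90 cannot be decoded.
try:
    pickle.loads(py2_pickle)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

# Latin-1 maps every byte to a code point, so decoding always succeeds.
as_text = pickle.loads(py2_pickle, encoding='latin1')

# encoding='bytes' skips decoding and returns the raw bytes object.
as_bytes = pickle.loads(py2_pickle, encoding='bytes')
```

Here the default call raises the same kind of `UnicodeDecodeError` seen in the question, while the other two calls return the data as `str` and `bytes` respectively.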

Here is something of a workaround, but I don't know how meaningful the data is at this point:

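import pickle
import gzip
import numpy

# 'mnist.pkl' is the decompressed file; use gzip.open() for 'mnist.pkl.gz'.
with open('mnist.pkl', 'rb') as f:
    # _Unpickler is the pure-Python unpickler class (a private name).
    u = pickle._Unpickler(f)
    # Decode Python 2 str objects as Latin-1 instead of the default ASCII.
    u.encoding = 'latin1'
    p = u.load()
    print(p)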
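The same effect is available through the public API: in Python 3, `pickle.load` accepts an `encoding` keyword, and the Python documentation notes that `encoding='latin1'` is required for unpickling NumPy arrays pickled by Python 2. A minimal sketch follows, assuming the file is gzip-compressed as in the question; `load_py2_pickle` is a hypothetical helper name, not something from the original answer.

```python
import gzip
import pickle


def load_py2_pickle(path, encoding='latin1'):
    """Load a gzip-compressed pickle that was written by Python 2.

    Hypothetical helper: it only forwards `encoding` to pickle.load so that
    Python 2 str objects are decoded as Latin-1 instead of ASCII.
    """
    with gzip.open(path, 'rb') as f:
        return pickle.load(f, encoding=encoding)
```

For the dataset in the question this would be called as `train_set, valid_set, test_set = load_py2_pickle('mnist.pkl.gz')`, with numpy installed so that the arrays can be reconstructed.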

Unpickling it in Python 2 and then repickling it is only going to create the same problem again, so you need to save it in another format.
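One possible "other format" is numpy's own `.npz` archive, which stores plain numeric arrays without going through pickle. The sketch below assumes, as the question's code comment says, that each of the three sets is a pair of numpy arrays; the function names and the `train_x`-style keys are invented for this example. `save_splits` would be run under Python 2 after unpickling, and `load_splits` under Python 3.

```python
import numpy as np

# Names of the three dataset splits, in the order they appear in the pickle.
SPLITS = ('train', 'valid', 'test')


def save_splits(path, train_set, valid_set, test_set):
    """Write each (x, y) pair into one compressed .npz archive."""
    arrays = {}
    for name, (x, y) in zip(SPLITS, (train_set, valid_set, test_set)):
        arrays[name + '_x'] = np.asarray(x)
        arrays[name + '_y'] = np.asarray(y)
    np.savez_compressed(path, **arrays)


def load_splits(path):
    """Read the archive back as a tuple of three (x, y) pairs."""
    with np.load(path) as data:
        return tuple((data[name + '_x'], data[name + '_y']) for name in SPLITS)
```

Giving `path` an explicit `.npz` suffix avoids surprises, since numpy appends that extension when it is missing.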
