Unpickling a Python 2 object with Python 3
Question
I'm wondering if there is a way to load an object that was pickled in Python 2.4 with Python 3.4.
I've been running 2to3 on a large amount of company legacy code to bring it up to date.
Having done this, I get the following error when running the file:
  File "H:\fixers - 3.4\addressfixer - 3.4\trunk\lib\address\address_generic.py", line 382, in read_ref_files
    d = pickle.load(open(mshelffile, 'rb'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1: ordinal not in range(128)
The pickled object in question is a dict inside a dict, containing keys and values of type str.
So my question is: is there a way to load an object, originally pickled in Python 2.4, with Python 3.4?
Recommended answer
You'll have to tell pickle.load() how to convert Python 2 bytestring data to Python 3 strings, or you can tell pickle to leave them as bytes.
The default is to try to decode all string data as ASCII, and that decoding fails here. See the pickle.load() documentation:
Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to 'ASCII' and 'strict', respectively. The encoding can be 'bytes' to read these 8-bit string instances as bytes objects.
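To make the three behaviours described in that passage concrete, here is a minimal sketch. The byte string PY2_PICKLE is a hand-written stand-in for what Python 2 would write for a str at protocol 2, not data from the question, and try_load is a hypothetical helper name used only for illustration.

```python
import pickle

# Hand-written stand-in (an assumption, not the asker's file) for a
# protocol-2 pickle of the Python 2 str '\xe2\x80\x99', i.e. the UTF-8
# bytes of a right single quotation mark.
PY2_PICKLE = b'\x80\x02U\x03\xe2\x80\x99q\x00.'


def try_load(data, **kwargs):
    """Return the unpickled value, or the exception class name on failure."""
    try:
        return pickle.loads(data, **kwargs)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


# Default settings: encoding='ASCII', errors='strict', so byte 0xe2 is rejected.
default_result = try_load(PY2_PICKLE)

# Latin-1 accepts every byte value, so the load succeeds.
latin1_result = try_load(PY2_PICKLE, encoding='latin1')

# 'bytes' skips decoding and hands back the raw 8-bit string.
bytes_result = try_load(PY2_PICKLE, encoding='bytes')

print(default_result)
print(repr(latin1_result))
print(repr(bytes_result))
```

The same keyword arguments apply to pickle.load() when reading from a file opened in binary mode.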
Setting the encoding to latin1 allows you to import the data directly:
import pickle
with open(mshelffile, 'rb') as f:  # mshelffile is the path from the question
    d = pickle.load(f, encoding='latin1')
However, you'll need to verify that none of your strings were decoded with the wrong codec. Latin-1 works for any input because it maps the byte values 0-255 directly to the first 256 Unicode code points, so it never raises an error even when the data was actually written in another encoding.
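Because the Latin-1 mapping is one-to-one, a string that was decoded with the wrong codec can be turned back into its original bytes and decoded again. The sketch below assumes the original data was UTF-8, which the 0xe2 byte in the traceback suggests but does not prove. repair_utf8 is a hypothetical helper, not part of pickle.

```python
def repair_utf8(text):
    """Re-decode a Latin-1-decoded string as UTF-8 where that is possible.

    Assumption: the Python 2 program stored UTF-8 bytes. Strings that are
    not valid UTF-8 after the round trip are returned unchanged.
    """
    try:
        # encode('latin1') recovers the exact original bytes (0-255).
        return text.encode('latin1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# What encoding='latin1' produces for the UTF-8 bytes E2 80 99.
mis_decoded = '\xe2\x80\x99'
repaired = repair_utf8(mis_decoded)
print(repr(repaired))
```

Plain ASCII strings pass through unchanged, so a helper like this can be applied to every key and value. Text that really was Latin-1 is also left alone as long as it does not happen to form valid UTF-8.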
The alternative is to load the data with encoding='bytes' and decode all bytes keys and values afterwards.
Note that in Python versions before 3.6.8, 3.7.2 and 3.8.0, unpickling Python 2 datetime object data is broken unless you use encoding='bytes'.