How to pickle and unpickle to a portable string in Python 3
Question
I need to pickle a Python 3 object to a string that I can later unpickle from an environment variable in a Travis CI build. The problem is that I can't find a way to pickle to a portable (unicode) string in Python 3:
import os, pickle
from my_module import MyPickleableClass
obj = {'cls': MyPickleableClass, 'other_stuf': '(...)'}
pickled = pickle.dumps(obj)
# raises TypeError: str expected, not bytes
os.environ['pickled'] = pickled
# raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb (...)
os.environ['pickled'] = pickled.decode('utf-8')
pickle.loads(os.environ['pickled'])
Is there a way to serialize complex objects like datetime.datetime to unicode, or to some other string representation in Python 3, that I can transfer to a different machine and deserialize there?
I have tested the solutions suggested by @kindall, but pickle.dumps(obj, 0).decode() raises a UnicodeDecodeError. The base64 approach does work, although it needs an extra decode/encode step. That solution works on both Python 2.x and Python 3.x.
# encode returns bytes so it needs to be decoded to string
pickled = codecs.encode(pickle.dumps(obj), 'base64').decode()
type(pickled) # <class 'str'>
unpickled = pickle.loads(codecs.decode(pickled.encode(), 'base64'))
Recommended answer
pickle.dumps() produces a bytes object. Expecting these arbitrary bytes to be valid UTF-8 text (the assumption you make when you try to decode them to a string as UTF-8) is pretty optimistic. It would be a coincidence if it worked!
One solution is to use the older pickling protocol that uses entirely ASCII characters. The result is still a bytes object, but since it is ASCII-only it can be decoded to a string without trouble:
pickled = pickle.dumps(obj, 0).decode()
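One caveat worth checking against your own data: in Python 3, protocol 0 output is not always pure ASCII (a str containing a Latin-1 character such as 'é' is written as a raw byte), which may explain the UnicodeDecodeError reported in the question. Below is a minimal sketch that assumes Latin-1 as the text codec, since it maps every byte value to exactly one character. The helper names dumps_text and loads_text are hypothetical, not part of pickle.

```python
import datetime
import pickle


def dumps_text(obj):
    """Pickle with protocol 0, then map each byte to one Latin-1 character."""
    return pickle.dumps(obj, 0).decode("latin-1")


def loads_text(text):
    """Reverse dumps_text: recover the original bytes, then unpickle them."""
    return pickle.loads(text.encode("latin-1"))


# Sample data with a datetime and a non-ASCII string (illustrative only).
sample = {"when": datetime.datetime(2020, 1, 2, 3, 4, 5), "name": "café"}
text = dumps_text(sample)
restored = loads_text(text)
```

The resulting string can contain newlines and non-ASCII characters, so it may be less convenient than base64 for something like an environment variable.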
You could also use some other encoding method to encode a binary-pickled object as text, such as base64:
import codecs
pickled = codecs.encode(pickle.dumps(obj), "base64").decode()
Decoding would then be:
unpickled = pickle.loads(codecs.decode(pickled.encode(), "base64"))
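Putting the two halves together, a round trip through an environment variable might look like the sketch below. It swaps in the base64 module instead of the codecs call, because base64.b64encode emits a single line whereas the "base64" codec inserts newlines into longer output. The function names and the PICKLED_PAYLOAD variable are made up for illustration.

```python
import base64
import datetime
import os
import pickle


def to_env_string(obj):
    """Pickle obj and return it as a single-line ASCII string."""
    return base64.b64encode(pickle.dumps(obj)).decode("ascii")


def from_env_string(text):
    """Decode a string produced by to_env_string back into an object."""
    return pickle.loads(base64.b64decode(text))


# Hypothetical payload and variable name, for illustration only.
payload = {"created": datetime.datetime(2020, 1, 2, 3, 4, 5), "tags": ["a", "b"]}
os.environ["PICKLED_PAYLOAD"] = to_env_string(payload)
restored = from_env_string(os.environ["PICKLED_PAYLOAD"])
```

For classes of your own, the machine that unpickles still needs to be able to import the same module, as with any pickle.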
Using pickle with protocol 0 seems to result in shorter strings than base64-encoding binary pickles (and abarnert's suggestion of hex-encoding is going to be even larger than base64), but I haven't tested this rigorously. Test it with your own data and see.
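A quick way to run that comparison on your own data is sketched below. The encoded_sizes helper and the sample object are assumptions made for illustration, and the numbers will vary with the data and the Python version.

```python
import base64
import datetime
import pickle


def encoded_sizes(obj):
    """Return the length of each text-friendly encoding of obj's pickle."""
    binary = pickle.dumps(obj)  # default (binary) protocol
    return {
        "protocol0": len(pickle.dumps(obj, 0)),
        "base64": len(base64.b64encode(binary)),
        "hex": len(binary.hex()),
    }


# Illustrative sample; substitute the object you actually need to transfer.
sizes = encoded_sizes({"when": datetime.datetime(2020, 1, 2), "values": list(range(20))})
print(sizes)
```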