Serializing binary data in Python

Question

I have some binary data in Python, in the form of an array of byte strings.

Is there a portable way to serialize this data that other languages could read?

JSON is out, because I just found out that it has no real way to store binary data; its strings are expected to be Unicode.

I don't want to use pickle, because I don't want the security risk and because it would limit use of the data to other Python programs.

Any advice? I would really like to use a builtin library (or at least one that's part of the standard Anaconda distribution).

Recommended answer

If you just need the binary data in the strings and can easily recover the boundaries between the individual strings, you could write them directly to a file as raw bytes.
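One way to make the boundaries recoverable is to write a fixed-size length prefix before each string. The sketch below is only an illustration of that idea, not something the answer prescribes: the helper names `write_chunks` and `read_chunks` and the 4-byte big-endian prefix are assumptions chosen for the example, and it uses only the standard `struct` and `io` modules.

```python
import io
import struct

def write_chunks(stream, chunks):
    """Write each byte string preceded by its length (4-byte big-endian unsigned int)."""
    for chunk in chunks:
        stream.write(struct.pack(">I", len(chunk)))
        stream.write(chunk)

def read_chunks(stream):
    """Read back the byte strings written by write_chunks (hypothetical helper)."""
    chunks = []
    while True:
        header = stream.read(4)
        if not header:
            break  # clean end of stream
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack(">I", header)
        data = stream.read(length)
        if len(data) < length:
            raise ValueError("truncated chunk")
        chunks.append(data)
    return chunks

# An in-memory buffer stands in for a file opened in binary mode.
buf = io.BytesIO()
write_chunks(buf, [b"abc\xf3\x9c\xc6", b"", b"xyz"])
buf.seek(0)
restored = read_chunks(buf)
```

A reader in another language only needs to read four bytes, interpret them as a big-endian unsigned integer, and then read that many bytes, repeating until the file ends.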

If you can't recover the string boundaries easily, JSON seems like a good option:

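import json

a = [b"abc\xf3\x9c\xc6", b"xyz"]
# Decode each byte string as latin1 so JSON sees ordinary Unicode text
serialised = json.dumps([s.decode("latin1") for s in a])
print([s.encode("latin1") for s in json.loads(serialised)])

will print (under Python 3)

[b'abc\xf3\x9c\xc6', b'xyz']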

The trick here is that arbitrary binary strings are valid latin1: each byte value from 0 to 255 maps to the Unicode code point with the same number. So they can always be decoded to Unicode and encoded back to the original byte string again.
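The round-trip claim can be checked directly over all 256 byte values. The following is a minimal sketch; the helper names `dumps_bytes` and `loads_bytes` are hypothetical wrappers introduced for the example, not part of the `json` module.

```python
import json

# Every possible byte value, in order.
all_bytes = bytes(range(256))

# Decoding as latin1 never fails, and encoding restores the same bytes.
as_text = all_bytes.decode("latin1")
round_trip = as_text.encode("latin1")

# Each byte becomes the code point with the same numeric value.
code_points_match = [ord(c) for c in as_text] == list(range(256))

def dumps_bytes(chunks):
    """Serialize a list of byte strings to JSON text (hypothetical helper)."""
    return json.dumps([c.decode("latin1") for c in chunks])

def loads_bytes(text):
    """Inverse of dumps_bytes (hypothetical helper)."""
    return [s.encode("latin1") for s in json.loads(text)]

payload = dumps_bytes([all_bytes, b""])
restored = loads_bytes(payload)
```

Because `json.dumps` escapes non-ASCII characters by default, the resulting text is plain ASCII. A consumer in another language has to apply the same convention after parsing, that is, encode each string as latin1 (ISO-8859-1) to get the bytes back.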
