How to avoid decoding bytes to string all the time?
Question
Some packages in my project return values as bytes. Is there a configuration option or environment variable I can set so that I never need to decode bytes to str again? If so, what is it?
Recommended answer
Python 2 does what you want by default.
But a word of advice: this is NOT what you really want, and that is why Python 3 does not do it automatically.
To convert bytes to str, you need to know the encoding of the bytes:
s = b.decode(coding)
To convert str to bytes, you also need to know the desired encoding:
b = s.encode(coding)
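As a minimal sketch of the two conversions above, the following round trip assumes the bytes are UTF-8 encoded. The sample value and the variable names are illustrative only and do not come from any particular package.

```python
# Assumption: the incoming bytes are UTF-8 encoded.
coding = "utf-8"

raw = b"caf\xc3\xa9"               # bytes as a package might return them
text = raw.decode(coding)          # bytes -> str, using the known encoding
round_trip = text.encode(coding)   # str -> bytes, using the desired encoding

print(len(raw), len(text))         # byte count and character count differ
```

The accented character occupies two bytes but is a single character in the str, which is why the encoding has to be known before the bytes can be treated as text.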
Python 2 assumed coding == 'ASCII', so it worked for English / plain ASCII text but raised exceptions at runtime for everything else.
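The failure mode can be reproduced in Python 3 by decoding with ASCII explicitly. This is only an illustration of what an implicit ASCII assumption leads to, with made-up sample strings; it is not Python 2 code.

```python
# Plain ASCII bytes decode fine under an ASCII assumption.
ascii_ok = b"plain".decode("ascii")

# Bytes containing a non-ASCII character (here UTF-8 encoded) do not.
raw = "na\u00efve".encode("utf-8")
try:
    raw.decode("ascii")
    failed = False
except UnicodeDecodeError:
    # This is the kind of runtime exception the ASCII assumption produces.
    failed = True

print(ascii_ok, failed)
```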
So what you should do is:
- decide whether something should be processed as text (in that case use str) or as binary data (in that case keep it as bytes)
- decode early (right after loading or receiving the bytes)
- process as str
- encode late (right before saving or sending the bytes)
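The steps above can be sketched roughly as follows. The function name shout and its encoding parameter are hypothetical, and UTF-8 is assumed as the encoding at both boundaries.

```python
def shout(payload: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical text-processing step with bytes at both boundaries."""
    text = payload.decode(encoding)      # decode early, right after receiving
    processed = text.upper()             # process as str
    return processed.encode(encoding)    # encode late, right before sending


incoming = "h\u00e9llo".encode("utf-8")  # stands in for bytes from a package
result = shout(incoming)
print(result.decode("utf-8"))
```

Keeping the decode and encode calls at the edges means the code in between only ever deals with str, so the decoding does not have to be repeated throughout the program.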
Nowadays UTF-8 is the most popular encoding, so use it if you have no other requirements.