Converting bytes to string with str() returns string with speech marks
Question
Say I have a variable containing bytes:
>>> a = b'Hello World'
Its type can be verified with:
>>> type(a)
<class 'bytes'>
Now I try to convert a into a string with str():
>>> b = str(a)
And check that it is a string:
>>> type(b)
<class 'str'>
Now I try to print b, but I get a totally unexpected result:
>>> print(b)
b'Hello World'
It returns a string, as I would expect, but it also keeps the b (the bytes prefix) and the ' (quotation marks).
Why does it do this, instead of just printing the message between the quotation marks?
Recommended answer
Don't think of a bytes value as a string in some default 8-bit encoding. It's just binary data. As such, str(a) returns an encoding-agnostic string that represents the value of the byte string (the same text that repr(a) produces, including the b prefix and the quotes). If you want 'Hello World', be specific and decode the value.
>>> b = a.decode()
>>> type(b)
<class 'str'>
>>> print(b)
Hello World
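As a rough sketch of the options described above, the snippet below contrasts str() without an encoding against explicit decoding. The variable names are illustrative only, and it assumes the bytes hold UTF-8 text (which is also what decode() assumes when called with no arguments).

```python
a = b'Hello World'

# With no encoding argument, str() falls back to the repr of the bytes object.
as_repr = str(a)

# Decoding turns the binary data into text; UTF-8 is the default codec.
decoded_default = a.decode()
decoded_explicit = a.decode('utf-8')

# str() also decodes when it is given an encoding.
decoded_via_str = str(a, 'utf-8')

print(as_repr)          # b'Hello World'
print(decoded_default)  # Hello World
```

Naming the encoding explicitly is usually the safer habit, since the right codec depends on where the bytes came from.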
In Python 2, the distinction between bytes and text was blurred. Python 3 went to great lengths to separate the two: bytes for binary data, and str for readable text.
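One way to see that separation in practice is a minimal sketch like the following, which assumes Python 3 and uses made-up variable names: mixing the two types raises an error, so the conversion has to be spelled out in one direction or the other.

```python
raw = b'Hello'
text = 'Hello'

# Python 3 refuses to combine bytes and str implicitly.
mixing_failed = False
try:
    combined = raw + text
except TypeError:
    mixing_failed = True

# Converting explicitly works in either direction.
as_text = raw.decode('utf-8') + text
as_bytes = raw + text.encode('utf-8')

print(mixing_failed)  # True
print(as_text)        # HelloHello
print(as_bytes)       # b'HelloHello'
```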
For another perspective, compare
>>> list("Hello")
['H', 'e', 'l', 'l', 'o']
with
>>> list(b"Hello")
[72, 101, 108, 108, 111]
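The same point can be sketched with indexing. Assuming Python 3, a single element of a bytes object is an integer rather than a one-character string, and a list of integers can be turned back into bytes and then decoded. The names here are illustrative only.

```python
data = b"Hello"

# Indexing a bytes object yields an int, not a length-1 bytes value.
first = data[0]

# Slicing keeps the bytes type.
first_slice = data[:1]

# A sequence of ints in range(256) can be packed back into bytes.
rebuilt = bytes([72, 101, 108, 108, 111])

# Only decoding produces text; ASCII is enough for these values.
text = rebuilt.decode('ascii')

print(first)        # 72
print(first_slice)  # b'H'
print(text)         # Hello
```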