如何在 Python 3 中转换字节和字符串? [英] How to convert between bytes and strings in Python 3?
问题描述
This is a Python 101 type question, but it had me baffled for a while when I tried to use a package that seemed to convert my string input into bytes.
As you will see below I found the answer for myself, but I felt it was worth recording here because of the time it took me to unearth what was going on. It seems to be generic to Python 3, so I have not referred to the original package I was playing with; it does not seem to be an error (just that the particular package had a .tostring()
method that was clearly not producing what I understood as a string...)
My test program goes like this:
import mangler # spoof package
stringThing = """
<Doc>
<Greeting>Hello World</Greeting>
<Greeting>你好</Greeting>
</Doc>
"""
# print out the input
print('This is the string input:')
print(stringThing)
# now make the string into bytes
bytesThing = mangler.tostring(stringThing) # pseudo-code again
# now print it out
print('
This is the bytes output:')
print(bytesThing)
The output from this code gives this:
This is the string input:
<Doc>
<Greeting>Hello World</Greeting>
<Greeting>你好</Greeting>
</Doc>
This is the bytes output:
b'
<Doc>
<Greeting>Hello World</Greeting>
<Greeting>xe4xbdxa0xe5xa5xbd</Greeting>
</Doc>
'
So, there is a need to be able to convert between bytes and strings, to avoid ending up with non-ascii characters being turned into gobbledegook.
The 'mangler' in the above code sample was doing the equivalent of this:
bytesThing = stringThing.encode(encoding='UTF-8')
There are other ways to write this (notably using bytes(stringThing, encoding='UTF-8')
, but the above syntax makes it obvious what is going on, and also what to do to recover the string:
newStringThing = bytesThing.decode(encoding='UTF-8')
When we do this, the original string is recovered.
Note, using str(bytesThing)
just transcribes all the gobbledegook without converting it back into Unicode, unless you specifically request UTF-8, viz., str(bytesThing, encoding='UTF-8')
. No error is reported if the encoding is not specified.
这篇关于如何在 Python 3 中转换字节和字符串?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!