Best way to convert string to bytes in Python 3?


Question

There appear to be two different ways to convert a string to bytes, as seen in the answers to TypeError: 'str' does not support the buffer interface

Which of these methods would be better or more Pythonic? Or is it just a matter of personal preference?

b = bytes(mystring, 'utf-8')

b = mystring.encode('utf-8')

Recommended answer

If you look at the docs for bytes, it points you to bytearray:

bytearray([source[, encoding[, errors]]])

Return a new array of bytes. The bytearray type is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has, see Bytes and Byte Array Methods.

The optional source parameter can be used to initialize the array in a few different ways:

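If it is a string, you must also give the encoding (and optionally, errors) parameters; bytearray() then converts the string to bytes using str.encode().

If it is an integer, the array will have that size and will be initialized with null bytes.

If it is an object conforming to the buffer interface, a read-only buffer of the object will be used to initialize the bytes array.

If it is an iterable, it must be an iterable of integers in the range 0 <= x < 256, which are used as the initial contents of the array.

Without an argument, an array of size 0 is created.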
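
As a rough illustration of the source forms listed above, the sketch below calls the constructor once per form. The variable names and sample values are made up for this example and do not come from the documentation.

```python
# Each call exercises one form of the optional source argument.
from_text = bytearray("héllo", "utf-8")   # string: an encoding is required
zero_filled = bytearray(3)                # integer: that many null bytes
from_buffer = bytearray(b"abc")           # buffer-interface object: contents are copied
from_ints = bytearray([72, 105])          # iterable of integers in range(256)
empty = bytearray()                       # no argument: size 0

# bytes accepts the same source forms but returns an immutable object.
same_as_bytes = bytes("héllo", "utf-8") == bytes(from_text)
```

Passing a string without an encoding, as in bytearray("abc"), raises TypeError.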

So bytes can do much more than just encode a string. It is Pythonic for the constructor to accept any type of source parameter that makes sense.

For encoding a string, I think that some_string.encode(encoding) is more Pythonic than using the constructor, because it is the most self-documenting: "take this string and encode it with this encoding" is clearer than bytes(some_string, encoding), where there is no explicit verb.

I checked the Python source. If you pass a unicode string to bytes using CPython, it calls PyUnicode_AsEncodedString, which is the implementation of encode; so you're just skipping a level of indirection if you call encode yourself.
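
A quick way to see that the two spellings agree is to compare their results directly. This is only a sketch with an arbitrary sample string; it checks the observable behaviour, not the CPython internals described above.

```python
text = "naïve café"

# Both spellings produce identical bytes for the same encoding.
via_constructor = bytes(text, "utf-8")
via_method = text.encode("utf-8")
same_result = via_constructor == via_method

# The errors argument is honoured the same way in both spellings.
same_with_errors = bytes(text, "ascii", "replace") == text.encode("ascii", "replace")

# One visible difference: the constructor insists on an encoding for str input,
# while str.encode() defaults to UTF-8.
try:
    bytes(text)
except TypeError:
    constructor_needs_encoding = True
else:
    constructor_needs_encoding = False

default_is_utf8 = text.encode() == via_method
```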

Also, see Serdalis' comment -- unicode_string.encode(encoding) is also more Pythonic because its inverse is byte_string.decode(encoding) and symmetry is nice.
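
The symmetry mentioned above can be shown with a short round trip. The sample text is arbitrary and chosen only because it contains non-ASCII characters.

```python
original = "Zażółć gęślą jaźń"

encoded = original.encode("utf-8")   # str -> bytes
decoded = encoded.decode("utf-8")    # bytes -> str, the inverse operation

round_trip_ok = decoded == original
```

The constructor-based counterpart of decode is str(encoded, "utf-8"), which reads less symmetrically next to bytes(original, "utf-8").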
