Best way to convert string to bytes in Python 3?


Problem description

There appear to be two different ways to convert a string to bytes, as seen in the answers to TypeError: 'str' does not support the buffer interface

Which of these methods would be better or more Pythonic? Or is it just a matter of personal preference?

b = bytes(mystring, 'utf-8')

b = mystring.encode('utf-8')

Solution

If you look at the docs for bytes, it points you to bytearray:

bytearray([source[, encoding[, errors]]])

Return a new array of bytes. The bytearray type is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has, see Bytes and Byte Array Methods.

The optional source parameter can be used to initialize the array in a few different ways:

If it is a string, you must also give the encoding (and optionally, errors) parameters; bytearray() then converts the string to bytes using str.encode().

If it is an integer, the array will have that size and will be initialized with null bytes.

If it is an object conforming to the buffer interface, a read-only buffer of the object will be used to initialize the bytes array.

If it is an iterable, it must be an iterable of integers in the range 0 <= x < 256, which are used as the initial contents of the array.

Without an argument, an array of size 0 is created.
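As a rough illustration of the cases listed above, here is a minimal sketch that exercises each kind of source argument. The variable names and sample values are invented for the example; bytes accepts the same kinds of arguments but returns an immutable object.

```python
# Each call mirrors one of the documented source cases.
from_text = bytearray("héllo", "utf-8")   # string plus a required encoding
zeroed = bytearray(3)                      # integer: that many null bytes
from_buffer = bytearray(b"abc")            # object supporting the buffer interface
from_ints = bytearray([104, 105])          # iterable of integers, 0 <= x < 256
empty = bytearray()                        # no argument: size 0

# Integers outside 0 <= x < 256 are rejected.
try:
    bytearray([256])
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True

print(from_text, zeroed, from_buffer, from_ints, empty)
```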

So bytes can do much more than just encode a string. It's Pythonic for the constructor to accept any type of source parameter that makes sense.

For encoding a string, I think that some_string.encode(encoding) is more Pythonic than using the constructor, because it is the most self-documenting: "take this string and encode it with this encoding" is clearer than bytes(some_string, encoding), where there is no explicit verb.

Edit: I checked the Python source. If you pass a unicode string to bytes using CPython, it calls PyUnicode_AsEncodedString, which is the implementation of encode; so you're just skipping a level of indirection if you call encode yourself.

Also, see Serdalis' comment -- unicode_string.encode(encoding) is also more Pythonic because its inverse is byte_string.decode(encoding) and symmetry is nice.
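To make the comparison concrete, the following sketch checks that both spellings produce the same bytes and that decode reverses encode. The sample text and variable names are made up for illustration.

```python
text = "naïve café"

# Both forms yield identical bytes for the same encoding.
via_constructor = bytes(text, "utf-8")
via_method = text.encode("utf-8")
same_result = via_constructor == via_method

# decode() is the inverse of encode(), which is the symmetry mentioned above.
round_trip = via_method.decode("utf-8")

# The constructor requires an explicit encoding when given a str.
try:
    bytes(text)
    needs_encoding = False
except TypeError:
    needs_encoding = True

print(same_result, round_trip == text, needs_encoding)
```

One practical difference: in Python 3, str.encode() defaults to UTF-8 when called with no arguments, while bytes(some_string) without an encoding raises TypeError.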
