What is the difference between a string and a byte string?

Question

I am working with a library which returns a byte string and I need to convert this to a string.

Although I'm not sure what the difference is - if any.

Recommended answer

Assuming Python 3 (in Python 2, this difference is a little less well-defined): a string is a sequence of characters, i.e. Unicode code points. These are an abstract concept and can't be directly stored on disk. A byte string is, unsurprisingly, a sequence of bytes, which are things that can be stored on disk. The mapping between the two is an encoding. There are quite a lot of these (and infinitely many are possible), and you need to know which one applies in your particular case in order to do the conversion, since a different encoding may map the same bytes to a different string:

>>> b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'.decode('utf-16')
'蓏콯캁澽苏'
>>> b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'.decode('utf-8')
'τoρνoς'
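
To make the character/byte distinction concrete, here is a minimal sketch that reuses the bytes from the example above; the variable names `data` and `text` are only illustrative. It shows that the two objects have different lengths and that indexing them yields different kinds of values.

```python
# The same ten bytes used in the example above.
data = b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'

# Decoding with the right encoding yields a str of Unicode code points.
text = data.decode('utf-8')

print(type(data).__name__, len(data))  # bytes object, length counted in bytes
print(type(text).__name__, len(text))  # str object, length counted in code points

# Indexing bytes gives an integer in range(256);
# indexing a str gives a one-character str.
print(data[0])
print(text[0])
```

The lengths differ because UTF-8 uses two bytes for each of the Greek letters here and one byte for each Latin "o".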

Once you know which one to use, you can use the .decode() method of the byte string to get the right character string from it as above. For completeness, the .encode() method of a character string goes the opposite way:

>>> 'τoρνoς'.encode('utf-8')
b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'
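
For the situation in the question (a library hands back a byte string), the conversion might be wrapped as in the sketch below. `to_text` is a hypothetical helper, not part of any library, and the UTF-8 default is an assumption: the library's documentation should say which encoding it actually uses. If the guess is wrong, `.decode()` may raise `UnicodeDecodeError`; the fallback here uses the standard `errors='replace'` option, which substitutes U+FFFD for bytes that can't be decoded.

```python
def to_text(raw: bytes, encoding: str = 'utf-8') -> str:
    """Decode bytes to str (hypothetical helper; the default encoding is an assumption)."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Wrong encoding guess or corrupt data: keep what can be decoded
        # and mark each undecodable byte with U+FFFD.
        return raw.decode(encoding, errors='replace')


raw = b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'
text = to_text(raw)

# Encoding with the same encoding gets the original bytes back.
round_trip = text.encode('utf-8')
print(round_trip == raw)

# A byte that is never valid in UTF-8 triggers the fallback path.
damaged = to_text(b'\xff')
print(repr(damaged))
```

Silently replacing bad bytes can hide a wrong encoding choice, so in many cases letting the exception propagate is the safer design.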
