Which encoding is used for strings in Python 2.x?
Question
What is the default encoding used for strings in Python 2.x? I've read that there are two possible ways to declare a string.
string = 'this is a string'
unicode_string = u'this is a unicode string'
The second string is Unicode. What is the encoding of the first string?
Recommended answer
As per "Python default/implicit string encodings and conversions" (summarizing its Python 2 part concisely, to minimize duplication):
There are actually multiple independent "default" string encodings in Python 2, each used by a different part of its functionality.
Parsing the code and string literals:
- str from a literal: contains the raw bytes from the file; no transcoding is done
- unicode from a literal: the bytes from the file are decoded with the file's "source encoding", which defaults to ascii
- with the unicode_literals future import, all literals in the file are treated as Unicode literals
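To make the literal rules concrete, here is a minimal sketch that imitates them by hand. It is written to run on both Python 2 and Python 3, so instead of relying on the parser it starts from the bytes a source file would contain for the character "é". The variable names and the choice of UTF-8 as the source encoding are assumptions made for illustration.

```python
from __future__ import print_function

# Assumption: the source file is saved as UTF-8, so "é" is stored as two bytes.
raw_from_file = b'\xc3\xa9'

# A Python 2 str literal keeps the file's bytes untouched (no transcoding).
as_str_literal = raw_from_file

# A unicode literal is the same bytes decoded with the source encoding.
as_unicode_literal = raw_from_file.decode('utf-8')

print(len(as_str_literal))      # 2 (bytes)
print(len(as_unicode_literal))  # 1 (code point)
```

The two objects describe the same text, but only the second one knows which characters it holds; the first is just a byte sequence whose meaning depends on how the file was saved.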
Transcoding/type conversion:
- str<->unicode type conversion and encode/decode without arguments are done with sys.getdefaultencoding(), which is almost always ascii, so any national characters will cause a UnicodeError
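The failure mode described above can be reproduced explicitly. The sketch below assumes the default encoding is ascii, as on a stock Python 2, and passes it by hand so that the same code also runs on Python 3. The helper try_decode is hypothetical and exists only for this illustration.

```python
from __future__ import print_function
import sys

# 'ascii' on a stock Python 2; Python 3 reports 'utf-8' here.
print(sys.getdefaultencoding())

def try_decode(data, encoding='ascii'):
    """Hypothetical helper: decode bytes the way Python 2 does implicitly
    for str -> unicode, returning (text, error) instead of raising."""
    try:
        return data.decode(encoding), None
    except UnicodeError as exc:
        return None, exc

# Plain ASCII bytes convert without trouble.
ascii_text, ascii_error = try_decode(b'abc')

# UTF-8 bytes for a national character ("é") cannot be decoded as ascii.
national_text, national_error = try_decode(b'\xc3\xa9')

print(ascii_text, ascii_error)
print(national_text, type(national_error).__name__)
```

Passing the real encoding of the data explicitly, for example try_decode(b'\xc3\xa9', 'utf-8'), avoids the error, which is the usual fix in Python 2 code as well.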
I/O, including printing:
- unicode: encoded with <file>.encoding if set, otherwise implicitly converted to str (with the aforementioned result)
- str: raw bytes are written to the stream, no transcoding is done. For national characters, a terminal will show different glyphs depending on its locale settings.
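The rule for writing unicode objects can be modelled in a few lines. This is only a sketch of the decision described above, not the interpreter's actual code path: encode_for_stream is a hypothetical helper, and the ascii fallback is an assumption matching the usual Python 2 default encoding.

```python
from __future__ import print_function

def encode_for_stream(text, stream_encoding=None, default_encoding='ascii'):
    """Hypothetical model: use the stream's encoding if it has one,
    otherwise fall back to the (assumed ascii) default encoding."""
    return text.encode(stream_encoding or default_encoding)

text = u'\xe9'  # "é"

# Stream that reports an encoding, e.g. a UTF-8 terminal.
to_terminal = encode_for_stream(text, 'utf-8')

# Stream without an encoding, e.g. a pipe: the ascii fallback fails.
try:
    encode_for_stream(text, None)
    pipe_error = None
except UnicodeError as exc:
    pipe_error = exc

print(repr(to_terminal))
print(type(pipe_error).__name__)
```

This is why a Python 2 script that prints Unicode text can work in a terminal yet fail once its output is redirected to a file or a pipe.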