Which encoding is used for strings in Python 2.x?

Question

What is the default encoding used for strings in Python 2.x? I've read that there are two possible ways to declare a string.

string = 'this is a string'
unicode_string = u'this is a unicode string'

The second string is Unicode. What is the encoding of the first string?

Recommended answer

As per "Python default/implicit string encodings and conversions" (summarizing its Python 2 part concisely, to minimize duplication):

There are actually multiple independent "default" string encodings in Python 2, used by different parts of its functionality.


Parsing the code and string literals:

  • str from a literal -- contains the raw bytes from the source file; no transcoding is done
  • unicode from a literal -- the bytes from the file are decoded with the file's "source encoding", which defaults to ascii
  • with the unicode_literals future import (from __future__ import unicode_literals), all unprefixed string literals in the file are treated as Unicode literals
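To illustrate the str bullet above, here is a minimal sketch. It is written to run on both Python 2 and Python 3, which is a choice made for this example rather than part of the original answer, and the character and the two codecs are picked purely for illustration. It shows that one character turns into different raw bytes depending on the encoding the file was saved in, so the content of a str literal depends on the file, not on Python.

```python
from __future__ import print_function

# One character, U+00E9 (e with acute accent), written with an escape so
# that this example itself does not depend on the file's encoding.
text = u'\xe9'

# What an editor would store on disk for that character under two
# different file encodings (illustrative choices).
utf8_bytes = text.encode('utf-8')
latin1_bytes = text.encode('latin-1')

# A Python 2 str literal would hold whichever byte sequence the file has.
print(len(utf8_bytes), len(latin1_bytes))  # 2 1
print(utf8_bytes == latin1_bytes)          # False
```

A unicode literal avoids this ambiguity only if the declared source encoding matches the encoding the file was actually saved in.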

Transcoding/type conversion:

  • str<->unicode type conversion and encode/decode without arguments use sys.getdefaultencoding()
    • which is almost always ascii, so any non-ASCII ("national") characters will cause a UnicodeError
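The failure mode described above can be sketched as follows. The example names the ascii codec explicitly so that it behaves the same on Python 2 and Python 3; on Python 2 the same decode happens implicitly whenever a str is mixed with a unicode object. The sample bytes are an assumption chosen for illustration.

```python
from __future__ import print_function
import sys

# 'ascii' on a stock Python 2; Python 3 reports 'utf-8' instead.
print(sys.getdefaultencoding())

raw = b'caf\xc3\xa9'  # UTF-8 bytes for "cafe" with an accented e

# Decoding with ascii is what Python 2 attempts implicitly for
# expressions such as  u'x' + raw  or  unicode(raw).
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Naming the real encoding explicitly avoids the error.
decoded = raw.decode('utf-8')

print(ascii_failed)   # True
print(len(decoded))   # 4
```

Decoding bytes explicitly at the program's boundaries, rather than relying on the implicit conversion, sidesteps this class of UnicodeError.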

I/O, including printing:

  • unicode -- encoded with <file>.encoding if set, otherwise implicitly converted to str (with the aforementioned result)
  • str -- raw bytes are written to the stream; no transcoding is done. For national characters, a terminal will show different glyphs depending on its locale settings.
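One way to avoid the implicit conversion on output is to encode unicode text yourself before writing it. The sketch below assumes a hypothetical helper, encode_for_stream, and a utf-8 fallback; neither comes from the original answer. It uses the stream's encoding attribute when one is set and falls back otherwise, and it runs on both Python 2 and Python 3.

```python
import io

def encode_for_stream(text, stream, fallback='utf-8'):
    """Hypothetical helper: encode text for a byte stream.

    Uses stream.encoding when it is set; otherwise uses the fallback
    (an assumption of this sketch) instead of relying on ascii.
    """
    encoding = getattr(stream, 'encoding', None) or fallback
    return text.encode(encoding, 'replace')

# io.BytesIO has no encoding attribute, so the fallback is used here.
buf = io.BytesIO()
payload = encode_for_stream(u'caf\xe9', buf)
buf.write(payload)

written = buf.getvalue()  # the UTF-8 bytes for the text
```

The 'replace' error handler is a design choice for this sketch: it substitutes characters the target encoding cannot represent instead of raising an exception.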
