Which file encodings are supported for Python 3 source files?

Question

Before you go telling me to read PEP 0263, keep reading...

I can't find any documentation that details which file encodings are supported for Python 3 source files.

I've found hundreds (thousands?) of questions, answers, posts, emails, etc. about how to declare - at the top of your source file - the encoding of that source file, but none of them answer my question. Bear with me and imagine doing (or actually try) the following:


  1. Open Notepad (I'm using regular old Notepad on Windows 7, but I doubt it matters; I'm sure your superior editor can do something similar.)
  2. Type your favorite line of Python code ( I used print( 'Hello, world!' ) )
  3. Select "File" -> "Save"
  4. Select a folder and file name ( I used "E:\Temp\hello.py" )
  5. Change the "Encoding:" setting from the default "ANSI" to "Unicode"
  6. Press "Save"
  7. Open a command prompt, change to the folder containing your new file, and try to run it

This is the output I get:

E:\Temp>python --version
Python 3.4.1

E:\Temp>python "hello.py"
  File "hello.py", line 1
SyntaxError: Non-UTF-8 code starting with '\xff' in file hello.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Now, when I open this same file in Notepad++ and look at the "Encoding" menu, it has the option "Encode in UCS-2 Little Endian" selected. Wikipedia tells me that this is basically UTF-16 encoding. Whatever. I don't really care. More research reveals that my editor has inserted a two-byte BOM (Byte Order Mark) with a value of '\xff\xfe' at the front of the file to indicate the file encoding. So at least I know where the '\xff' code that Python is complaining about comes from.
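
To make the origin of that byte concrete, here is a minimal sketch that builds the kind of bytes Notepad's "Unicode" option writes: a little-endian BOM followed by UTF-16LE text. The `source` string is just the example line from the steps above; the line ending Notepad appends is left out, so this is an approximation of the file, not a byte-for-byte copy.

```python
import codecs

source = "print( 'Hello, world!' )"

# Notepad's "Unicode" option is UTF-16 little endian with a leading BOM.
data = codecs.BOM_UTF16_LE + source.encode("utf-16-le")

print(data[:2])   # b'\xff\xfe'  -> the BOM; '\xff' is the byte Python complains about
print(data[2:6])  # b'p\x00r\x00' -> every ASCII character is followed by a zero byte
```

The zero bytes interleaved with the text are why the file is not ASCII-compatible, even once the BOM is set aside.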

So I go and read PEP 0263 - and everything else regarding it - on the web, and I try adding a comment like this to the first line of the file:

# coding: utf-16

with all sorts of different values for the encoding, and nothing helps. But it can't help, right? Because Python isn't even getting as far as my encoding declaration; it's choking on the first byte of the source file!
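
That suspicion can be checked with the standard library's `tokenize.detect_encoding`, which looks for a UTF-8 BOM or a PEP 0263 coding comment. The sketch below is only an illustration: `sniff` is a hypothetical helper name, the sample bytes are made up to resemble the file described above, and the interpreter's own tokenizer may word its error differently from the `tokenize` module.

```python
import codecs
import io
import tokenize

def sniff(data):
    """Return the encoding tokenize detects for source bytes, or the error text."""
    try:
        encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
        return encoding
    except SyntaxError as exc:
        return "SyntaxError: {}".format(exc)

text = "# coding: utf-16\nprint('Hello, world!')\n"

# ASCII-compatible bytes: the coding comment can be found and read.
utf8_source = b"# coding: utf-8\nprint('Hello, world!')\n"
# BOM + UTF-16LE bytes: the first line cannot even be decoded to look for the comment.
utf16_source = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

print(sniff(utf8_source))
print(sniff(utf16_source))
```

The first call reports `utf-8`; the second fails before the declaration is ever interpreted, which matches the behaviour described above.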

So what I really want to know is...


  1. Why can't the Python 3 interpreter read this file?
  2. If "Unicode" or "UCS-2 Little Endian" or "UTF-16" or whatever isn't supported, what is???

P.S. I even found another question on StackOverflow which seems to be the exact issue I'm having, but it was closed - erroneously in my opinion - as a duplicate. :(

--- EDIT ---

Someone asked for my "compiled options". Here's some output. Maybe it will help?

E:\Temp>python
Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sysconfig
>>> print( sysconfig.get_config_vars() )
{'EXT_SUFFIX': '.pyd', 'srcdir': 'C:\\Python34', 'py_version_short': '3.4', 'base': 'C:\\Python34', 'prefix': 'C:\\Python34', 'projectbase': 'C:\\Python34', 'INCLUDEPY': 'C:\\Python34\\Include', 'platbase': 'C:\\Python34', 'py_version_nodot': '34', 'exec_prefix': 'C:\\Python34', 'EXE': '.exe', 'installed_base': 'C:\\Python34', 'SO': '.pyd', 'installed_platbase': 'C:\\Python34', 'VERSION': '34', 'BINLIBDEST': 'C:\\Python34\\Lib', 'LIBDEST': 'C:\\Python34\\Lib', 'userbase': 'C:\\Users\\alonghi\\AppData\\Roaming\\Python', 'py_version': '3.4.1', 'abiflags': '', 'BINDIR': 'C:\\Python34'}
>>>


Recommended answer

The source encoding must be:

  1. An encoding supported by the Python version in question. (This depends on version and platform; for example, you only get mbcs on Windows.)
  2. Loosely ASCII-compatible, enough that the # coding: declaration can be read using ascii, which is the initial source encoding before any declaration is read. See PEP 0263, 'Concepts', item 1.
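
As a small illustration of the second condition, the sketch below compiles source bytes that declare latin-1, an encoding whose ASCII range matches ASCII, so the declaration line itself is readable before the encoding is known. The sample source and the `<latin1_example>` filename are made up for this example.

```python
# Source bytes in latin-1: the byte 0xE9 is 'é' in that encoding.
latin1_source = b"# coding: latin-1\ns = '\xe9'\n"

namespace = {}
# compile() accepts bytes and applies the PEP 0263 declaration when decoding them.
exec(compile(latin1_source, "<latin1_example>", "exec"), namespace)

print(namespace["s"] == "\u00e9")  # True: the declared encoding was used
```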

The encoding that Windows misleadingly calls "Unicode", UTF-16LE, is not ASCII-compatible (and generally is a barrel of problems you should try to avoid using). Python would need special encoding-specific support to detect UTF-16 source files and this feature has been declined for now.

The # coding: you should use is almost invariably UTF-8.
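
If a file has already been saved as "Unicode", one way out is to re-encode it as UTF-8. The following is a minimal sketch, not part of the original answer: `convert_utf16_to_utf8` is a hypothetical helper, and the file names are placeholders created in a temporary folder so the example is self-contained.

```python
import os
import tempfile

def convert_utf16_to_utf8(src_path, dst_path):
    # The "utf-16" codec reads the BOM to pick the byte order, then drops it.
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    # newline="" keeps the original line endings untouched.
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)

with tempfile.TemporaryDirectory() as folder:
    utf16_path = os.path.join(folder, "hello_utf16.py")
    utf8_path = os.path.join(folder, "hello_utf8.py")

    # Stand-in for the file saved from Notepad (BOM + UTF-16 text).
    with open(utf16_path, "w", encoding="utf-16", newline="") as f:
        f.write("print('Hello, world!')\r\n")

    convert_utf16_to_utf8(utf16_path, utf8_path)

    with open(utf8_path, "rb") as f:
        converted = f.read()

print(converted)
# The converted bytes are plain UTF-8, so they compile without any declaration.
compile(converted, "hello_utf8.py", "exec")
```

In practice it is usually simpler to pick UTF-8 in the editor's save dialog; the script is only useful when many files need fixing.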
