Getting Python to print in UTF-8 on Windows XP with the console
Question
I would like to configure my console on Windows XP to support UTF-8 and have Python detect that and work with it.
So far, my attempts:
C:\Documents and Settings\Philippe>C:\Python25\python.exe
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print u'é'
é
>>> import sys
>>> sys.stdout.encoding
'cp437'
>>> quit()
So, by default I am in cp437, and Python detects that just fine.
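As a rough illustration of what sits behind that print call: Python encodes the Unicode string with the codec named by sys.stdout.encoding before handing bytes to the console. The sketch below is written for Python 3 rather than the Python 2.5 used in the session, and the helper name encoded_for is made up for this example. It shows the bytes that cp437 and UTF-8 produce for the same character.

```python
import sys


def encoded_for(text, encoding):
    """Return the bytes a console stream using `encoding` would receive."""
    return text.encode(encoding)


sample = u"\u00e9"  # the character é

cp437_bytes = encoded_for(sample, "cp437")  # a single byte in the OEM code page
utf8_bytes = encoded_for(sample, "utf-8")   # a two-byte sequence in UTF-8

print(repr(cp437_bytes), repr(utf8_bytes))

# The encoding Python detected for the current stdout.
print(sys.stdout.encoding)
```

The console only displays é correctly when it interprets those bytes with the same code page Python used to produce them.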
C:\Documents and Settings\Philippe>chcp 65001
Active code page: 65001
C:\Documents and Settings\Philippe>python
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'cp65001'
>>> print u'é'
C:\Documents and Settings\Philippe>
It seems like printing in UTF-8 makes Python crash now...
Recommended answer
"I would like to configure my console on Windows XP to support UTF8"
I don't think this is going to happen.
The 65001 code page is buggy; some stdio calls behave incorrectly and break many tools. You can register cp65001 as an encoding manually:
import codecs

def cp65001(name):
    # Codec search function: treat 'cp65001' as an alias for UTF-8.
    if name.lower() == 'cp65001':
        return codecs.lookup('utf-8')

codecs.register(cp65001)
This lets you print u'some unicode string', but it still doesn't let you write non-ASCII characters in that Unicode string. You get the same odd errors (IOError 0 et al.) that you do when you try to write non-ASCII UTF-8 sequences directly as byte strings.
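To see what the registration above does and does not do, here is a small sketch for Python 3. It registers a search function for a made-up codec name, demo_cp65001 (hypothetical, chosen so the example does not collide with any codec Python may already provide), and checks that encoding through the alias yields ordinary UTF-8 bytes. This only affects codec lookup; it changes nothing about how the console treats those bytes.

```python
import codecs

ALIAS = "demo_cp65001"  # hypothetical codec name used only for this sketch


def utf8_alias(name):
    """Codec search function: map ALIAS onto the built-in UTF-8 codec."""
    if name.lower() == ALIAS:
        return codecs.lookup("utf-8")
    return None  # leave every other name to the remaining search functions


codecs.register(utf8_alias)

encoded = u"\u00e9".encode(ALIAS)
decoded = encoded.decode(ALIAS)
print(repr(encoded), decoded == u"\u00e9")
```

Returning None for unknown names matters: a search function that raised or returned something else would interfere with lookups for unrelated codecs.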
Unfortunately, UTF-8 is a second-class citizen under Windows. NT's Unicode model was drawn up before UTF-8 existed, and consequently you're expected to use two-byte-per-code-unit encodings (UTF-16, originally UCS-2) anywhere you want consistent Unicode. Using byte strings, as many portable apps and languages (such as Python) written with C's stdio do, doesn't fit that model.
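The difference between the two models can be made concrete with a short Python 3 sketch. The helper utf16_code_units is a name invented for this example; it counts how many 16-bit code units a string needs in UTF-16, next to the number of bytes the same string takes in UTF-8.

```python
def utf16_code_units(text):
    """Number of 16-bit code units needed to store `text` as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def utf8_bytes(text):
    """Number of bytes needed to store `text` as UTF-8."""
    return len(text.encode("utf-8"))


# ASCII letter, Latin-1 letter, euro sign, and a character outside the BMP.
samples = [u"e", u"\u00e9", u"\u20ac", u"\U0001F600"]

for ch in samples:
    print("U+%04X: %d UTF-8 byte(s), %d UTF-16 code unit(s)"
          % (ord(ch), utf8_bytes(ch), utf16_code_units(ch)))
```

Characters outside the Basic Multilingual Plane need two code units (a surrogate pair), which is the point where UTF-16 and the older fixed-width UCS-2 diverge.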
And rewriting Python to use the Windows Unicode console calls (like WriteConsoleW) instead of the portable C stdio ones doesn't play well with shell tricks like piping and redirecting to a file. (Not to mention that you still have to change from the default terminal font to a TrueType one before you can see the results working at all...)
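The tension described above can be sketched as a writer with two paths: WriteConsoleW when standard output is a real console, and plain encoded bytes when it is a pipe or file. This is a minimal Python 3 sketch under stated assumptions, not how Python itself is implemented: the function names write_unicode and _write_console_w are hypothetical, partial writes are ignored, and the Win32 branch only runs on Windows.

```python
import io
import sys

STD_OUTPUT_HANDLE = -11  # Win32 identifier for the standard output handle


def _write_console_w(text):
    """Try WriteConsoleW; return False when stdout is not a Windows console."""
    if sys.platform != "win32":
        return False
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32")
    lpdword = ctypes.POINTER(wintypes.DWORD)
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.GetConsoleMode.argtypes = [wintypes.HANDLE, lpdword]
    kernel32.WriteConsoleW.argtypes = [
        wintypes.HANDLE, wintypes.LPCWSTR, wintypes.DWORD,
        lpdword, wintypes.LPVOID,
    ]
    handle = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    mode = wintypes.DWORD()
    # GetConsoleMode fails for pipes and files, which reveals redirection.
    if not kernel32.GetConsoleMode(handle, ctypes.byref(mode)):
        return False
    sys.stdout.flush()  # keep ordering with anything already buffered
    units = len(text.encode("utf-16-le")) // 2  # length in UTF-16 code units
    written = wintypes.DWORD()
    ok = kernel32.WriteConsoleW(handle, text, units,
                                ctypes.byref(written), None)
    return bool(ok)


def write_unicode(text, stream=None, encoding="utf-8"):
    """Write text via WriteConsoleW on a console, else as encoded bytes.

    Returns "console" or "bytes" to show which path was taken.
    """
    if stream is None:
        stream = sys.stdout
    if stream is sys.stdout and _write_console_w(text):
        return "console"
    binary = getattr(stream, "buffer", stream)  # text streams expose .buffer
    binary.write(text.encode(encoding))
    return "bytes"


demo = io.BytesIO()  # stands in for a pipe or a redirected file
path = write_unicode(u"\u00e9\n", demo)
print(path, repr(demo.getvalue()))
```

Having to choose an encoding for the byte path while the console path uses UTF-16 is exactly the mismatch that makes piping and redirection awkward.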
Ultimately, if you need a command line with working UTF-8 support for stdio-based apps, you'd probably be better off using an alternative to the Windows Console that deliberately supports it, such as Cygwin's, Python's IDLE, or pywin32's PythonWin.