Which character encoding is the IPython terminal using?

Question

I used to think I had this whole encoding stuff pretty figured out. I seem to be wrong because I can't explain what's happening here.

What I was trying to do is to use the tabulate module to print a nicely formatted table using

from tabulate import tabulate
s = tabulate([[1,2],[3,4]], ["x","y"], tablefmt="fancy_grid")
print(s)

in IPython 3.5.0's interactive console under Windows 10. I expected the result to be

╒═════╤═════╕
│   x │   y │
╞═════╪═════╡
│   1 │   2 │
├─────┼─────┤
│   3 │   4 │
╘═════╧═════╛

But instead, I got a

UnicodeEncodeError: 'charmap' codec can't encode character '\u2552' in position 0: character maps to <undefined>

Puzzled, I tried to find out where the problem was and looked at the repr of the string:

In [15]: s
Out[15]: '╒═════╤═════╕\n│   x │   y │\n╞═════╪═════╡\n│   1 │   2 │\n├─────┼─────┤\n│   3 │   4 │\n╘═════╧═════╛'

Hmm, all the characters can be displayed by the terminal (even the first one that triggered the error).

Just to check a few details:

In [16]: sys.stdout.encoding
Out[16]: 'cp850'

In [17]: s.encode("cp850")
[...]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2552' in position 0: character maps to <undefined>

So which encoding is the terminal using? Python says that it's cp850, and it tells me that cp850 doesn't have a character (which is true, it's one of the characters from cp437 that had to make room for accented letters), but I can see it in the terminal window!
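The claim about the two code pages can be checked directly with Python's built-in codecs. This is a minimal sketch; the helper name `can_encode` is made up for illustration and is not part of any library.

```python
# Check which code pages can represent the box-drawing character U+2552.
# `can_encode` is a hypothetical helper written for this example.
def can_encode(char, encoding):
    """Return True if `char` is representable in `encoding`."""
    try:
        char.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


for codec in ("cp437", "cp850", "utf-8"):
    # Only the codec name and a boolean are printed, so this line is
    # safe to run even on a console that cannot display the character.
    print(codec, can_encode("\u2552", codec))
```

The same helper can be applied to every character of `s` to list the ones that the active code page lacks.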

To complicate things further, when using the native Python console instead of IPython, the error seems more understandable:

>>> s
'\u2552═══\u2564═══\u2555\n│ 1 │ 2 │\n├───┼───┤\n│ 3 │ 4 │\n\u2558═══\u2567═══\u255b'
>>> sys.stdout.encoding
'cp850'
>>> print(s)
Traceback (most recent call last):
[...]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2552' in position 0: character maps to <undefined>

So at least Python is consistent, but what's happening with IPython?

Recommended answer

IPython uses the OEM code page in interactive mode, like any other Python console program:

In [1]: '\u2552'
ERROR - failed to write data to stream: <_io.TextIOWrapper name='<stdout>' mode=
'w' encoding='cp850'>
Out[1]:

In [2]: !chcp
Active code page: 850

The result changes if pyreadline is installed (it enables colors in the IPython console among other things):

In [1]: '\u2552'
Out[1]: '╒'

In [2]: import sys

In [3]: sys.stdout.encoding
Out[3]: 'cp850'

In [4]: !chcp
Active code page: 850

Once pyreadline is installed, IPython's sys.displayhook writes the result to readline's console object, which uses the WriteConsoleW() Windows Unicode API. That API can print Unicode characters even when they cannot be encoded in the current code page (to see them, you might need to configure a TrueType font such as Lucida Console in the Windows console).
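When output goes through `sys.stdout` rather than the console API, `print(s)` still has to encode the text with the stream's code page. One possible workaround, which is not part of the answer above, is to escape the characters that the code page lacks before printing. The sketch below assumes that escaped output is acceptable; the helper name `printable` is hypothetical. It yields the same kind of `\u2552` escapes that the plain Python console shows in the question.

```python
import sys


# `printable` is a hypothetical helper for this example, not a library function.
def printable(text, encoding=None):
    """Return `text` with characters that `encoding` lacks shown as \\uXXXX escapes."""
    # Fall back to UTF-8 if stdout reports no encoding (e.g. when replaced).
    encoding = encoding or sys.stdout.encoding or "utf-8"
    # "backslashreplace" substitutes an escape sequence instead of raising
    # UnicodeEncodeError; decoding again gives a str that is safe to print.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


# Uses the active stdout encoding, so this print cannot raise UnicodeEncodeError.
print(printable("\u2552\u2550\u2555"))
```

With `encoding="cp850"`, the corner characters become escapes while the double horizontal line, which cp850 does contain, is left as it is.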
