Inconsistent output of Unicode box-drawing characters in Python IDLE

Question

I have the following code:

# -*- coding: utf-8 -*-
print "╔╤╤╦╤╤╦╤╤╗"
print "╠╪╪╬╪╪╬╪╪╣"
print "╟┼┼╫┼┼╫┼┼╢"
print "╚╧╧╩╧╧╩╧╧╝"
print "║"
print "│"

For some reason, only the third line (╚╧╧╩╧╧╩╧╧╝) actually outputs properly; the rest is an odd combination of symbols. I assume this is due to an encoding issue. The full output in IDLE is as follows:

â•"╤╤╦╤╤╦╤╤╗
╠╪╪╬╪╪╬╪╪╣
â•Ÿâ"¼â"¼â•«â"¼â"¼â•«â"¼â"¼â•¢
╚╧╧╩╧╧╩╧╧╝
â•‘
â"‚

What is causing this and how can I fix this? I'm using a tablet (Surface Pro 3 with Win10) with only a touch keyboard, so any solution with the least amount of typing (especially typing out weird characters) would be ideal, but obviously all help is appreciated.

Recommended answer

Mojibake indicates that text encoded in one encoding is being displayed using another, incompatible encoding:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
print(u"╔╤╤╦╤╤╦╤╤╗".encode('utf-8').decode('cp1252')) #XXX: DON'T DO IT
# -> â•"╤╤╦╤╤╦╤╤╗

There are several places where the wrong encoding could be used.

The # coding: utf-8 encoding declaration says how non-ASCII characters in your source code (e.g., inside string literals) should be interpreted. If print u"╔╤╤╦╤╤╦╤╤╗" works in your case, then the source code itself is being decoded to Unicode correctly. For debugging, you could write the string using only ASCII characters: u'\u2554\u2557' == u'╔╗'.
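
The debugging tip above can be sketched as follows. This is only an illustration, written for Python 3 (where the u prefix is accepted but optional); the helper name describe is hypothetical and not part of the original answer. It builds the string from ASCII-only escapes and lists each code point with its official Unicode name, so the result does not depend on how the source file was saved or decoded.

```python
import unicodedata


def describe(text):
    """Return (escape, Unicode name) pairs for each character in text.

    Hypothetical helper for illustration only.
    """
    return [("\\u%04x" % ord(ch), unicodedata.name(ch)) for ch in text]


# ASCII-only source: nothing here depends on the encoding declaration.
corners = u"\u2554\u2557"

for escape, name in describe(corners):
    # Only ASCII is printed, so this is safe on any console.
    print(escape, name)
```

If the names printed are the box-drawing names you expect, the string itself is fine and any garbling happens later, at output time.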

print "╔╤╤╦╤╤╦╤╤╗" (DON'T DO IT) prints bytes (text encoded using utf-8 in this case) as is. IDLE itself works with Unicode (BMP), so the bytes must be decoded into Unicode text before they can be shown in IDLE. It seems that on Windows IDLE uses an ANSI code page such as cp1252 (locale.getpreferredencoding(False)) to decode the output bytes. Don't print text as bytes: it fails in any environment that uses a character encoding different from that of your source code. For example, you would get ΓòöΓòù... mojibake if you ran the code from the question in a Windows console that uses the cp437 OEM code page.
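
To make the effect concrete, here is a minimal Python 3 sketch that takes the UTF-8 bytes of the two corner characters and decodes them with the code pages mentioned above. The show helper is hypothetical, and ascii() is used so that the demonstration itself prints only ASCII and cannot be garbled by the console.

```python
# ASCII-only escapes keep the example independent of the source encoding.
text = u"\u2554\u2557"        # the two top corners of the box
data = text.encode("utf-8")   # the bytes a Python 2 byte-string print would emit


def show(label, value):
    # Hypothetical helper: ascii() escapes every non-ASCII character.
    print(label, ascii(value))


show("utf-8 bytes:", data)
show("read as cp1252:", data.decode("cp1252"))  # the IDLE-style mojibake
show("read as cp437:", data.decode("cp437"))    # the console-style mojibake
show("read as utf-8:", data.decode("utf-8"))    # the original text again
```

The same six bytes give three different results; only decoding with the encoding that produced them recovers the text.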

You should use Unicode for all text in your program. Python 3 even forbids non-ascii characters inside a bytes literal. You would get SyntaxError there.
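
A small sketch of that Python 3 behaviour, assuming a Python 3 interpreter: the text of a bytes literal containing a box-drawing character is handed to compile(), so the SyntaxError can be caught instead of stopping the script. The variable names are illustrative only.

```python
# Source text of a bytes literal that contains a non-ASCII character.
source = 'b"\u2554"'

try:
    compile(source, "<demo>", "eval")
    outcome = "compiled"
except SyntaxError:
    outcome = "SyntaxError"

print(outcome)

# The explicit alternative: keep text as Unicode and encode only at the
# boundary where bytes are really required.
encoded = u"\u2554".encode("utf-8")
print(ascii(encoded))
```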

print(u'\u2554\u2557') might fail with UnicodeEncodeError if you run the code in a Windows console whose OEM code page (such as cp437) cannot represent the characters. To print arbitrary Unicode characters in a Windows console, use the win-unicode-console package. You don't need it if you use IDLE.
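
Whether a given code page can represent a string can be checked without a console at all. The sketch below (Python 3; can_encode is a hypothetical helper) simply tries to encode the text. Note that cp437 happens to include the box-drawing characters, so the failure only appears for characters outside the code page in use; cp1252 is used here as an example of a code page that lacks them.

```python
def can_encode(text, encoding):
    """Return True if text can be represented in the given encoding.

    Hypothetical helper for illustration only.
    """
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


corners = u"\u2554\u2557"

for codec in ("cp437", "cp1252", "utf-8"):
    print(codec, can_encode(corners, codec))

# A lossy fallback: unencodable characters are replaced with "?".
fallback = corners.encode("cp1252", errors="replace")
print(fallback)
```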
