How to print ê and other special characters available in ASCII in Python for Windows

Question

I would like to print an ê in Python for Windows. At the DOS prompt I can type Alt+136 to get an ê, but when I try this inside Python in the DOS window (code page cp437, or cp1252 after running chcp 1252), Alt+136 does not produce the ê character. Why is this?

print(chr(136)) correctly prints ê under code page cp437, but how can I open a Unicode file with these characters:

Sokal’, L’vivs’ka Oblastâ€
BucureÅŸti, Romania
ง'⌣'

and get it to print those characters instead of the gobbledygook below:

>>> import codecs
>>> f = codecs.open("unicode.txt", "r", "utf-8")
>>> f.read()
u"Sokal\xe2\u20ac\u2122, L\xe2\u20ac\u2122vivs\xe2\u20ac\u2122ka Oblast\xe2\u20ac\nBucure\xc5\u0178ti, Romania\n\xe0\xb8\u2021'\
xe2\u0152\xa3'\nThis text should be in \xe2\u20ac\u0153quotes\xe2\u20ac\\x9d.\nBroken text… it’s ?ubberi?c!"

or even worse:

>>> f = codecs.open("unicode.txt", "r", "utf-8")
>>> print(f.read())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 6-7: character maps to <undefined>

The following

import codecs
f = codecs.open("unicode.txt", "r", "utf-8")
s = f.read()
print(s.encode('utf8'))

prints

Sokal’, L’vivs’ka Oblastâ€
BucureÅŸti, Romania
ง'⌣'
This text should be in “quotesâ€\x9d.
Broken text&hellip; it&#x2019;s ?ubberi?c!

instead of

Sokal’, L’vivs’ka Oblastâ€
BucureÅŸti, Romania
ง'⌣'

I am using:

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32

Is there some way of replacing the ê, etc. in the Unicode string with the printable ASCII version of ê, aka chr(136)?

Note that my question is about how to create a new non-Unicode, extended-ASCII string from the original UTF-8 text: one that changes the non-printable characters to characters in the ASCII code page where equivalents exist, or replaces them with a ? or something similar where no equivalent is available.

Recommended answer

I see multiple questions here; you've stumbled upon several common Unicode issues:

  • how to type ê? -- Alt+136 should work for cp437. Try Alt+234 for cp1252 (not tested):

>>> u'ê'.encode('cp437')
b'\x88'
>>> int('88', 16)
136
>>> u'ê'.encode('cp1252')
b'\xea'
>>> int('ea', 16)
234

  • how to print Unicode to the Windows console in Python? How to fix the UnicodeEncodeError: 'charmap' ... exception? -- follow the link

  • why does the Python console display u'\u20ac' instead of €? And in reverse, how to display the ê Unicode character using only printable ASCII characters, e.g., u'\xea'? -- The Python REPL uses the (customizable) sys.displayhook() function to display the result of a Python expression. It calls repr(), e.g.:

    >>> print u'ê'
    ê
    >>> print repr(u'ê')
    u'\xea'
    >>> u'ê'
    u'\xea'
    

    u'\xea' is a text representation of the corresponding Unicode string. You can use it as a Unicode string literal to create the string in Python source code.
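To make the point about escaped literals concrete, here is a minimal sketch. It assumes Python 3 (the question uses Python 2.7), where the built-in ascii() produces the ASCII-only form that Python 2's repr() showed in the session above.

```python
# Python 3 sketch (assumption: the session above is Python 2, where repr() does this).
s = u'\xea'                 # the same one-character string as u'ê', written with an escape

escaped = ascii(s)          # ASCII-only text representation of the string
print(escaped)              # the escaped literal form
print(len(s), hex(ord(s)))  # one character, code point U+00EA
```

The escaped form and the character itself denote the same string; only the way it is displayed differs.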

    It might not be necessary in your case, but in general, to input or display arbitrary Unicode characters in the Windows console, you could install the win-unicode-console package.
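The question also asks how to fall back to ? for characters the console code page lacks. One possible approach, not taken from the answer itself, is to round-trip the text through the target code page with the "replace" error handler. The helper name to_code_page and the default of cp437 are assumptions made for illustration.

```python
def to_code_page(text, encoding="cp437"):
    """Return text with every character missing from `encoding` replaced by '?'.

    Hypothetical helper: characters the code page can represent are kept as-is.
    """
    return text.encode(encoding, "replace").decode(encoding)


sample = u"Bucure\u015fti \xea"   # contains ş (not in cp437) and ê (in cp437)
converted = to_code_page(sample)
print(ascii(converted))           # shown escaped so the demo itself prints on any console
```

Printing the converted string can no longer raise UnicodeEncodeError on a console that uses that code page, at the cost of losing the characters it cannot show.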

    Unrelated: print(chr(136)) is incorrect. It will produce wrong output if the environment uses a character encoding incompatible with yours, e.g.:

    >>> print chr(136)
    �
    

    Print Unicode instead:

    >>> print unichr(234)
    ê
    

    The reason is that chr() returns a bytestring on Python 2. The same byte may represent different characters in different character encodings, which is why you should always use Unicode when you work with text.
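The claim that one byte can stand for different characters is easy to check. The sketch below assumes Python 3 and decodes the byte 136 under the two code pages mentioned in the question.

```python
# The single byte 0x88, which is what Python 2's chr(136) returns.
raw = bytes(bytearray([136]))

as_cp437 = raw.decode("cp437")     # ê (U+00EA)
as_cp1252 = raw.decode("cp1252")   # ˆ (U+02C6), a different character
print(ascii(as_cp437), ascii(as_cp1252))

# Going the other way, one Unicode character maps to different bytes.
print(u"\xea".encode("cp437"), u"\xea".encode("cp1252"))
```

This is why the byte value 136 only means ê while the console happens to use cp437, whereas the Unicode string u'\xea' means ê everywhere.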
