How to print utf-8 to console with Python 3.4 (Windows 8)?


Question

I've never fully wrapped my head around encoding and decoding Unicode to other formats (utf-8, utf-16, ascii, etc.), but I've reached a wall that is both confusing and frustrating. What I'm trying to do is print utf-8 card symbols (♠,♥,♦,♣) from a Python module to a Windows console. The console that I'm using is git bash, and I'm using console2 as a front-end. I've tried/read a number of approaches below and nothing has worked so far. Let me know if what I'm doing is possible and what the right way to do it is.


  • Make sure the console can handle utf-8 characters.
    These two tests lead me to believe that the console isn't the problem.


  • Attempt the same thing from the Python module.
    When I execute the .py, this is the result.

print(u'♠')
UnicodeEncodeError: 'charmap' codec can't encode character '\u2660' in position 0: character maps to <undefined>


  • Attempt to encode ♠.
    This gives me back the character encoded as utf-8 bytes, but still no spade symbol.

    text = '♠'
    print(text.encode('utf-8'))
    b'\xe2\x99\xa0'
    


    I feel like I'm missing a step or not understanding the whole encode/decode process. I've read this, this, and this. The last of the pages suggests wrapping sys.stdout in the code, but this article says using stdout is unnecessary and points to another page using the codecs module.

    I'm so confused! I feel as though quality documentation on this subject is hard to find, and hopefully someone can clear this up. Any help is always appreciated!

    Austin

    Recommended answer


    What I'm trying to do is print utf-8 card symbols (♠,♥,♦,♣) from a python module to a windows console

    UTF-8 is a byte encoding of Unicode characters. ♠♥♦♣ are Unicode characters which can be reproduced in a variety of encodings, and UTF-8 is one of those encodings. As a UTF, UTF-8 can reproduce any Unicode character. But there is nothing specifically "UTF-8" about those characters.
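    As a small illustration of that point, the sketch below encodes the same character with three different UTFs. The variable names are only for illustration; the character is written as an escape so the snippet does not depend on the source file's encoding.

```python
# One Unicode character, several byte encodings.
spade = '\u2660'  # BLACK SPADE SUIT

utf8_bytes = spade.encode('utf-8')       # three bytes
utf16_bytes = spade.encode('utf-16-le')  # two bytes, little-endian
utf32_bytes = spade.encode('utf-32-le')  # four bytes, little-endian

print(utf8_bytes)   # b'\xe2\x99\xa0'
print(len(utf16_bytes), len(utf32_bytes))

# Decoding with the matching codec gives back the same character.
assert utf8_bytes.decode('utf-8') == spade
assert utf16_bytes.decode('utf-16-le') == spade
```

    The byte sequences differ, but each one decodes back to the same single code point.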

    Other encodings that can reproduce the characters ♠♥♦♣ are Windows code page 850 and 437, which your console is likely to be using under a Western European install of Windows. You can print ♠ in these encodings but you are not using UTF-8 to do so, and you won't be able to use other Unicode characters that are available in UTF-8 but outside the scope of these code pages.

    print(u'♠')
    UnicodeEncodeError: 'charmap' codec can't encode character '\u2660'
    

    In Python 3 this is the same as the print('♠') test you did above, so there is something different about how you are invoking the script containing this print, compared to your py -3.4. What does sys.stdout.encoding give you from the script?
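    One way to answer that question from inside the script is sketched below. The helper name can_encode is made up for this example; it only tests whether a string survives encoding with a given codec, which is the step that fails in the traceback above.

```python
import sys


def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# sys.stdout.encoding can be None when stdout has been replaced by an
# object that is not a normal text stream, so fall back to ASCII here.
encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'
print('stdout encoding:', encoding)
print('can encode U+2660:', can_encode('\u2660', encoding))
```

    If the second line reports False, print('♠') will raise UnicodeEncodeError under that invocation.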

    To get print working correctly you would have to make sure Python picks up the right encoding. If it is not doing that adequately from the terminal settings, you would indeed have to set PYTHONIOENCODING to cp437.
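    To see what PYTHONIOENCODING does, the sketch below starts a child interpreter with the variable set and asks it for its sys.stdout.encoding. The helper name child_stdout_encoding is hypothetical, and the snippet assumes sys.executable points at a working Python 3 interpreter.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child Python with PYTHONIOENCODING set; return its stdout encoding."""
    env = os.environ.copy()
    env['PYTHONIOENCODING'] = io_encoding
    out = subprocess.check_output(
        [sys.executable, '-c', 'import sys; print(sys.stdout.encoding)'],
        env=env,
    )
    # The encoding name itself is plain ASCII.
    return out.decode('ascii').strip()


if __name__ == '__main__':
    print(child_stdout_encoding('cp437'))
```

    The variable has to be set before the interpreter starts; changing os.environ inside an already running script does not affect that script's own sys.stdout.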

    >>> text = '♠'
    >>> print(text.encode('utf-8'))
    b'\xe2\x99\xa0'
    

    print can only print Unicode strings. For other types, including the bytes string that results from the encode() method, it prints the literal representation (repr) of the object. b'\xe2\x99\xa0' is how you would write a Python 3 bytes literal containing a UTF-8 encoded ♠.
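    A short check of that behaviour, as a sketch with illustrative variable names: for a bytes object, the text that print writes is the same as its repr, and decoding is what turns the bytes back into a string.

```python
data = '\u2660'.encode('utf-8')  # bytes, not text

# print(data) writes str(data), which for bytes is the same as repr(data).
shown = str(data)
print(shown)  # b'\xe2\x99\xa0'

# Decoding with the codec used for encoding recovers the original string.
text = data.decode('utf-8')
assert text == '\u2660'
assert isinstance(data, bytes) and isinstance(text, str)
```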

    If what you want to do is bypass print's implicit encoding (the one taken from PYTHONIOENCODING or the terminal settings) and substitute your own, you can do that explicitly:

    >>> import sys
    >>> sys.stdout.buffer.write('♠'.encode('cp437'))
    

    This will of course generate wrong output for any consoles not running code page 437 (e.g. non-Western-European installs). Generally, for apps using the C stdio, like Python does, getting non-ASCII characters to the Windows console is just too unreliable to bother with.
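    Given that unreliability, a defensive option is to encode with an error handler so that characters the console's code page lacks are substituted instead of raising UnicodeEncodeError. This is only a sketch and not part of the answer above; encode_for_console is a made-up helper name, and 'ascii' stands in for whatever encoding the console actually reports.

```python
def encode_for_console(text, encoding, errors='replace'):
    """Encode text for a console, substituting characters the codec lacks."""
    return text.encode(encoding, errors)


suits = '\u2660\u2665\u2666\u2663'  # the four card suit characters

# 'replace' turns each unencodable character into '?'.
as_question_marks = encode_for_console(suits, 'ascii')

# 'backslashreplace' keeps the code points readable as escape sequences.
as_escapes = encode_for_console(suits, 'ascii', errors='backslashreplace')

print(as_question_marks)
print(as_escapes)
```

    The resulting bytes could then be written with sys.stdout.buffer.write, at the cost of losing the symbols on consoles that cannot show them.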
