Print a Unicode string in Python regardless of environment


Problem description





I'm trying to find a generic solution for printing Unicode strings from a Python script.

The requirements are that it must run in both Python 2.7 and 3.x, on any platform, and with any terminal settings and environment variables (e.g. LANG=C or LANG=en_US.UTF-8).

Python's print function automatically encodes to the terminal encoding when printing, but it fails on non-ASCII characters if the terminal encoding is ASCII.

For example, the following works when the environment has "LANG=en_US.UTF-8":

x = u'\xea'
print(x)

But it fails in Python 2.7 when "LANG=C":

UnicodeEncodeError: 'ascii' codec can't encode character u'\xea' in position 0: ordinal not in range(128)
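The error does not come from print itself but from the ASCII codec, so it can be reproduced in any environment by encoding the string explicitly. A minimal sketch (the names `ascii_failed` and `utf8_bytes` are only illustrative):

```python
x = u'\xea'  # LATIN SMALL LETTER E WITH CIRCUMFLEX

# The ASCII codec only covers code points 0-127, so encoding U+00EA raises.
try:
    x.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# UTF-8 can represent every code point; U+00EA becomes two bytes.
utf8_bytes = x.encode('utf-8')
```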

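The following works regardless of the LANG setting in Python 2.7 (in Python 3, printing a bytes object shows its repr, such as b'\xc3\xaa', rather than the character), but it would not display the characters properly if the terminal used an encoding other than UTF-8: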

print(x.encode('utf-8'))

The desired behavior would be to always show unicode in the terminal if it is possible and show some encoding if the terminal does not support unicode. For example, the output would be UTF-8 encoded if the terminal only supported ascii. Basically, the goal is to do the same thing as the python print function when it works, but in the cases where the print function fails, use some default encoding.
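The behavior described above could be sketched roughly as follows. This is only an illustration: `encode_for_stream` and `safe_print` are hypothetical helper names, and the choice of UTF-8 as the fallback is an assumption taken from the example in the previous paragraph.

```python
import sys


def encode_for_stream(text, encoding, fallback='utf-8'):
    """Encode with the stream's encoding if it can represent `text`,
    otherwise fall back to `fallback` (assumed here to be UTF-8)."""
    try:
        return text.encode(encoding or fallback)
    except (UnicodeEncodeError, LookupError):
        # The stream encoding cannot represent the text, or is unknown.
        return text.encode(fallback)


def safe_print(text, stream=None):
    """Write `text` plus a newline to `stream` as bytes, never raising
    UnicodeEncodeError."""
    stream = sys.stdout if stream is None else stream
    data = encode_for_stream(text + u'\n', getattr(stream, 'encoding', None))
    # Python 3 text streams expose their byte stream as .buffer;
    # Python 2's sys.stdout accepts byte strings directly.
    raw = getattr(stream, 'buffer', stream)
    raw.write(data)
    raw.flush()


safe_print(u'\xea')
```

Because this writes to the underlying byte stream, mixing it with ordinary print calls on the same stream may reorder output unless the text layer is flushed first.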

Solution

You can handle the LANG=C case by telling sys.stdout to default to UTF-8 in cases when it would otherwise default to ASCII.

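import sys, codecs

# Only override stdout when Python fell back to ASCII, or has no encoding at all
# (e.g. piped output in Python 2). 'ANSI_X3.4-1968' is the name glibc reports for ASCII.
if sys.stdout.encoding is None or sys.stdout.encoding == 'ANSI_X3.4-1968':
    utf8_writer = codecs.getwriter('UTF-8')
    if sys.version_info.major < 3:
        # Python 2: sys.stdout accepts byte strings directly.
        sys.stdout = utf8_writer(sys.stdout, errors='replace')
    else:
        # Python 3: wrap the underlying binary stream instead.
        sys.stdout = utf8_writer(sys.stdout.buffer, errors='replace')

print(u'\N{snowman}')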

The above snippet fulfills your requirements: it works in Python 2.7 and 3.4, and it doesn't break when LANG is in a non-UTF-8 setting such as C.
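To see what the writer returned by codecs.getwriter does without touching the real sys.stdout, it can be pointed at an in-memory byte stream. This is only a sketch for experimentation; the `raw` and `writer` names are illustrative.

```python
import codecs
import io

# An in-memory byte stream stands in for the terminal.
raw = io.BytesIO()

# codecs.getwriter returns a StreamWriter class; instantiating it wraps a
# byte stream so that text written to it is encoded as UTF-8.
writer = codecs.getwriter('UTF-8')(raw, errors='replace')
writer.write(u'\N{SNOWMAN}\n')

encoded = raw.getvalue()  # the bytes the terminal would have received
```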

It is not a new technique, but it's surprisingly hard to find in the documentation. As presented above, it actually respects non-UTF-8 settings such as ISO 8859-*. It only defaults to UTF-8 if Python would have bogusly defaulted to ASCII, breaking the application.
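The check for 'ANSI_X3.4-1968' matches one particular name for ASCII. A possible variation, not part of the snippet above, is to ask the codec registry whether the reported encoding is any alias of ASCII, which would also catch names such as 'US-ASCII'. The helper name `needs_utf8_override` is hypothetical.

```python
import codecs


def needs_utf8_override(encoding):
    """Return True if `encoding` is missing or is an alias of ASCII."""
    if encoding is None:
        return True
    try:
        # codecs.lookup resolves aliases; the ASCII codec's canonical name is 'ascii'.
        return codecs.lookup(encoding).name == 'ascii'
    except LookupError:
        # Unknown encoding name: leave stdout alone rather than guess.
        return False
```

With this helper, settings such as ISO-8859-1 or UTF-8 are still left untouched, which preserves the property described above.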
