python... encoding issue when using linux >


Problem description

A simple test program that reproduces the encoding issue:

#!/bin/env python
# -*- coding: utf-8 -*-
print u"Råbjerg"      # >>> unicodedata.name(u"å") = 'LATIN SMALL LETTER A WITH RING ABOVE'

Here is what I get when I run it from a Debian command line. I do not understand why redirecting the output breaks it, since the text is displayed correctly without the redirect.

Can someone help me understand what I have missed? And what is the right way to print these characters so that they come out correctly everywhere?

$ python testu.py
Råbjerg

$ python testu.py > A
Traceback (most recent call last):
  File "testu.py", line 3, in <module>
    print u"Råbjerg"
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 1: ordinal not in range(128)

Using Debian GNU/Linux 6.0.7 (squeeze), configured with:

$ locale
LANG=fr_FR.UTF-8
LANGUAGE=
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
LC_TIME="fr_FR.UTF-8"
LC_COLLATE="fr_FR.UTF-8"
LC_MONETARY="fr_FR.UTF-8"
LC_MESSAGES="fr_FR.UTF-8"
LC_PAPER="fr_FR.UTF-8"
LC_NAME="fr_FR.UTF-8"
LC_ADDRESS="fr_FR.UTF-8"
LC_TELEPHONE="fr_FR.UTF-8"
LC_MEASUREMENT="fr_FR.UTF-8"
LC_IDENTIFICATION="fr_FR.UTF-8"
LC_ALL=

Update, after reading the other similar questions pointed to below:

#!/bin/env python
# -*- coding: utf-8 -*-
import sys, locale
s = u"Råbjerg"      # >>> unicodedata.name(u"å") = 'LATIN SMALL LETTER A WITH RING ABOVE'
if sys.stdout.encoding is None: # when stdout is a pipe or file, Python 2 seems to report None
    s = s.encode(locale.getpreferredencoding())
print s

Recommended answer

When the output is redirected, sys.stdout is not connected to a terminal and Python cannot determine the output encoding, so Python 2 falls back to ASCII (hence the 'ascii' codec in the traceback). When the output is not redirected, Python detects that sys.stdout is a TTY and uses the codec configured for that TTY when printing unicode.
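To see the difference for yourself, a small diagnostic along these lines may help. It is only a sketch: describe_stream is a hypothetical helper name, not something from the question or from Python itself. It reports on stderr, so the result stays visible even when stdout is redirected. On Python 2, as used in the question, the encoding is expected to be None once stdout is redirected; Python 3 picks an encoding from the locale instead.

```python
import sys


def describe_stream(stream):
    """Return whether the stream is a TTY and which encoding Python attached to it.

    describe_stream is a hypothetical helper used only for illustration.
    """
    isatty = getattr(stream, "isatty", lambda: False)()
    encoding = getattr(stream, "encoding", None)
    return {"isatty": isatty, "encoding": encoding}


info = describe_stream(sys.stdout)
# Report on stderr so the message is not swallowed by "> A"
sys.stderr.write("isatty=%(isatty)s encoding=%(encoding)s\n" % info)
```

Running it once directly and once with "> A" appended should show the two cases described above.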

Set the PYTHONIOENCODING environment variable to tell Python which encoding to use in such cases, or encode explicitly.
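As a rough illustration of the "encode explicitly" option, the sketch below wraps a byte stream in a codec writer so that unicode text is always encoded on the way out. make_unicode_writer is a hypothetical helper name, and the in-memory buffer only stands in for a redirected stdout; choosing UTF-8 here is an assumption that matches the fr_FR.UTF-8 locale shown in the question.

```python
# -*- coding: utf-8 -*-
import codecs
import io
import locale


def make_unicode_writer(byte_stream, encoding=None):
    """Wrap a byte stream so that unicode text written to it is encoded explicitly.

    make_unicode_writer is a hypothetical helper; when no encoding is given
    it falls back to the locale's preferred encoding, then to UTF-8.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding() or "utf-8"
    return codecs.getwriter(encoding)(byte_stream)


# An in-memory byte buffer stands in for a redirected stdout
buf = io.BytesIO()
writer = make_unicode_writer(buf, "utf-8")
writer.write(u"R\xe5bjerg\n")  # u"\xe5" is LATIN SMALL LETTER A WITH RING ABOVE
encoded = buf.getvalue()
```

On Python 2 the same wrapper could be applied once at startup, for example sys.stdout = make_unicode_writer(sys.stdout); on Python 3 the underlying byte stream is sys.stdout.buffer. The environment-variable route needs no code change at all: PYTHONIOENCODING=utf-8 python testu.py > A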
