python... encoding issue when using linux


Problem description


A simple test program that shows the encoding issue:

#!/bin/env python
# -*- coding: utf-8 -*-
print u"Råbjerg"      # >>> unicodedata.name(u"å") = 'LATIN SMALL LETTER A WITH RING ABOVE'

Here is what I get when I run it from a Debian command line. I do not understand why redirecting the output breaks it, since the text displays correctly when I run it without redirection.

Can someone help me understand what I have missed? And what is the right way to print these characters so that they come out correctly everywhere?

$ python testu.py
Råbjerg

$ python testu.py > A
Traceback (most recent call last):
  File "testu.py", line 3, in <module>
    print u"Råbjerg"
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 1: ordinal not in range(128)

Using Debian GNU/Linux 6.0.7 (squeeze), configured with:

$ locale
LANG=fr_FR.UTF-8
LANGUAGE=
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
LC_TIME="fr_FR.UTF-8"
LC_COLLATE="fr_FR.UTF-8"
LC_MONETARY="fr_FR.UTF-8"
LC_MESSAGES="fr_FR.UTF-8"
LC_PAPER="fr_FR.UTF-8"
LC_NAME="fr_FR.UTF-8"
LC_ADDRESS="fr_FR.UTF-8"
LC_TELEPHONE="fr_FR.UTF-8"
LC_MEASUREMENT="fr_FR.UTF-8"
LC_IDENTIFICATION="fr_FR.UTF-8"
LC_ALL=

EDIT: based on other similar questions, found by following the pointers given below:

#!/bin/env python
# -*- coding: utf-8 -*-
import sys, locale
s = u"Råbjerg"      # >>> unicodedata.name(u"å") = 'LATIN SMALL LETTER A WITH RING ABOVE'
if sys.stdout.encoding is None:  # when stdout is a pipe, Python 2 seems to return None
    s = s.encode(locale.getpreferredencoding())
print s

Solution

When the output is redirected, sys.stdout is not connected to a terminal and Python cannot determine the output encoding. When the output is not redirected, Python can detect that sys.stdout is a TTY and will use the codec configured for that TTY when printing unicode.
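To make the failure concrete, here is a minimal sketch of what the traceback in the question suggests is happening: with no encoding on the stream, the text ends up going through the 'ascii' codec, which cannot represent å. The helper names `stream_encoding` and `can_encode` are hypothetical, and the sketch is written so that it also runs on Python 3, where the default behaviour of `print` differs from the Python 2 case discussed here.

```python
import io


def stream_encoding(stream):
    """Return the encoding a stream declares, or None if it declares none."""
    return getattr(stream, "encoding", None)


def can_encode(text, encoding):
    """Report whether every character of `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


text = u"R\u00e5bjerg"  # the string from the question, written with an escape

# A byte stream declares no encoding, much like the redirected stdout
# described in the question.
print(stream_encoding(io.BytesIO()))  # None

# The 'ascii' codec from the traceback cannot represent U+00E5 ...
print(can_encode(text, "ascii"))  # False
# ... while UTF-8 can.
print(can_encode(text, "utf-8"))  # True
```

In a real script, `sys.stdout.isatty()` and `sys.stdout.encoding` can be inspected the same way to see which of the two situations applies.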

Set the PYTHONIOENCODING environment variable to tell Python what encoding to use in such cases, or encode explicitly.
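Both options can be sketched as follows. This is an illustration rather than a definitive recipe: `run_with_io_encoding`, `write_encoded` and `CHILD_CODE` are hypothetical names, the child program stands in for testu.py, and the parent script assumes Python 3.5 or later for `subprocess.run`. The child's stdout is a pipe, which corresponds to the redirected case in the question.

```python
import io
import os
import subprocess
import sys

# Child program: writes one non-ASCII string to stdout. The escape keeps the
# command line ASCII-only; u"R\u00e5bjerg" is the string from the question.
CHILD_CODE = r'import sys; sys.stdout.write(u"R\u00e5bjerg")'


def run_with_io_encoding(encoding):
    """Run the child with stdout piped and PYTHONIOENCODING set; return raw bytes."""
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


def write_encoded(binary_stream, text, encoding="utf-8"):
    """Explicit alternative: encode the text yourself, then write the bytes."""
    binary_stream.write(text.encode(encoding))


if __name__ == "__main__":
    # Option 1: let the environment variable choose the codec.
    print(repr(run_with_io_encoding("utf-8")))

    # Option 2: encode explicitly before writing.
    buf = io.BytesIO()
    write_encoded(buf, u"R\u00e5bjerg")
    print(repr(buf.getvalue()))
```

From a shell, the environment-variable option amounts to running `PYTHONIOENCODING=utf-8 python testu.py > A`. Explicit encoding keeps the choice of codec inside the program, which may be preferable when the script cannot control its environment.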
