Python unicode in Mac OS X terminal

Problem description

Can someone explain to me this odd thing:

When I type the following Cyrillic string in the Python shell:

>>> print 'абвгд'
абвгд

but when I type:

>>> print u'абвгд'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-9: ordinal not in range(128)

Since the first string came out correctly, I reckon my OS X terminal can represent unicode, but it turns out it can't in the second case. Why?

Solution

>>> print 'абвгд'
абвгд

When you type in some characters, your terminal decides how these characters are represented to the application. Your terminal might give the characters to the application encoded as UTF-8, ISO-8859-5, or even something that only your terminal understands. Python receives these characters as a sequence of bytes. Python then prints these bytes out unchanged, and your terminal interprets them in some way to display characters. Since your terminal usually interprets the bytes the same way it encoded them before, everything is displayed the way you typed it in.
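The byte pass-through described above can be sketched as follows. This is a simulation, not what the interpreter does internally: it is written for Python 3, where bytes and text are separate types, and `round_trip` and `TEXT` are names made up for this illustration.

```python
# Python 3 sketch (assumption: we model the terminal with encode/decode calls).
TEXT = "абвгд"


def round_trip(text, terminal_encoding):
    """Simulate typing text, handing bytes to Python, and echoing them back."""
    typed_bytes = text.encode(terminal_encoding)        # terminal -> application
    echoed_bytes = typed_bytes                          # printed back unchanged
    displayed = echoed_bytes.decode(terminal_encoding)  # terminal -> screen
    return typed_bytes, displayed


for enc in ("utf-8", "iso-8859-5"):
    raw, shown = round_trip(TEXT, enc)
    # Only ASCII is printed here so the sketch runs in any console.
    print(enc, len(raw), "bytes, round trip intact:", shown == TEXT)
```

The same five letters take 10 bytes in UTF-8 and 5 bytes in ISO-8859-5, but either way the text survives as long as the same encoding is used in both directions.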

>>> u'абвгд'

Here you type in some characters that arrive at the Python interpreter as a sequence of bytes, possibly encoded in some way by the terminal. With the u prefix, Python tries to convert this data to unicode. To do this correctly, Python has to know what encoding your terminal uses. In your case it looks like Python guesses that your terminal's encoding is ASCII, but the received data doesn't match that, so you get an encoding error. Note that the traceback is a UnicodeEncodeError: it is raised when print converts the unicode object back to bytes for output using the 'ascii' codec.
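A rough way to see where the "position 0-9" in the traceback could come from is sketched below, again in Python 3. It assumes the terminal sent UTF-8 and that the bytes were turned into one character each (Latin-1 is used as a stand-in for that step); this is consistent with the traceback but is a guess about the asker's setup. `ascii_failure` is a helper name invented for the example.

```python
TEXT = "абвгд"


def ascii_failure(text):
    """Return the UnicodeEncodeError raised by ASCII encoding, or None."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return exc
    return None


utf8_bytes = TEXT.encode("utf-8")  # 10 bytes for 5 Cyrillic letters
# Assumption: each byte became one character (Latin-1 maps bytes 1:1).
one_char_per_byte = utf8_bytes.decode("latin-1")

err = ascii_failure(one_char_per_byte)
print(err.encoding, err.start, err.end - 1)  # ascii 0 9
```

All ten characters are outside the ASCII range, so the codec reports the whole span 0-9 as unencodable.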

The straightforward way to create unicode strings in an interactive session would therefore be something like this:

>>> us = 'абвгд'.decode('my-terminal-encoding')
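If you don't want to hard-code the encoding name, one option is to ask Python what it detected for the terminal. The sketch below is Python 3 and only illustrative: `decode_terminal_bytes` is a hypothetical helper, and the UTF-8 fallback is an assumption that may not match your terminal.

```python
import sys


def decode_terminal_bytes(raw, encoding=None):
    """Decode bytes typed at the terminal into text.

    Uses the explicit encoding if given, else what Python reports for
    stdin, else UTF-8 (an assumed fallback, not a universal default).
    """
    chosen = encoding or getattr(sys.stdin, "encoding", None) or "utf-8"
    return raw.decode(chosen)


# Pretend the terminal is configured for ISO-8859-5.
typed = "абвгд".encode("iso-8859-5")
us = decode_terminal_bytes(typed, "iso-8859-5")
print(len(typed), len(us))  # 5 5
```

Passing the wrong encoding name here either raises a UnicodeDecodeError or silently produces the wrong characters, so it is worth checking what the terminal is actually set to.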

In source files you can also declare the encoding of the file with a special comment on the first or second line:

# -*- encoding: ISO-8859-5 -*-
us = u'абвгд'
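To check that such a declaration is honoured, you can compile source bytes directly instead of writing a file. This is a Python 3 sketch; the in-memory `source` and the `"<example>"` filename are stand-ins for a real file saved as ISO-8859-5.

```python
# Build the bytes of a tiny source file saved in ISO-8859-5, with a
# matching coding declaration on its first line.
source = (
    "# -*- coding: iso-8859-5 -*-\n"
    "us = 'абвгд'\n"
).encode("iso-8859-5")

namespace = {}
# compile() accepts bytes and reads the coding declaration to decode them.
exec(compile(source, "<example>", "exec"), namespace)

print(len(namespace["us"]))  # 5
```

Without the declaration, the same bytes would be read with the interpreter's default source encoding and would not decode to the intended letters.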

For other ways to inspect or set the default input encoding, you can look at sys.stdin.encoding (the encoding Python detected for terminal input) and sys.setdefaultencoding(...).
