Unicode error: ordinal not in range


Problem description

I'm getting an odd unicode error. I had been handling unicode fine, but when I ran my code this morning, one item, u'\u201d', raised an error:

UnicodeError: ASCII encoding error: ordinal not in range(128)

I looked up the code and apparently it's utf-32, but when I try to decode it in the interpreter:

c = u'\u201d'
c.decode('utf-32', 'replace')

Neither that nor any other operation on it works; no codec seems to recognize it, even though I found that the character is "RIGHT DOUBLE QUOTATION MARK".

I get:

Traceback (most recent call last):
  File "<pyshell#154>", line 1, in <module>
    c.decode('utf-32')
  File "C:\Python27\lib\encodings\utf_32.py", line 11, in decode
    return codecs.utf_32_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201d' in position 0: ordinal not in range(128)

Solution

You already have a unicode string; there is no need to decode it to a unicode string again.

What happens in that case is that Python helpfully tries to encode the string for you first, so that there are bytes to decode from utf-32. It uses the default encoding to do so, which happens to be ASCII, and ASCII cannot represent u'\u201d'. That is why the error mentions the 'ascii' codec rather than utf-32. Here is an explicit encode to show you the exception raised in that case:

>>> u'\u201d'.encode('ASCII')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201d' in position 0: ordinal not in range(128)
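
The two steps can also be spelled out by hand. The sketch below is written for Python 3, where a text string has no decode method at all, so `py2_style_decode` is a hypothetical helper that only imitates the implicit encode-then-decode described above; it is not part of any library.

```python
def py2_style_decode(text, encoding, default_encoding="ascii"):
    """Imitate decoding a text string: implicit encode, then decode.

    Hypothetical helper, for illustration only.
    """
    # Step 1: the implicit encode with the default codec. For non-ASCII
    # text this is where UnicodeEncodeError is raised.
    raw = text.encode(default_encoding)
    # Step 2: the decode that was actually requested.
    return raw.decode(encoding)


quote = "\u201d"  # RIGHT DOUBLE QUOTATION MARK

try:
    py2_style_decode(quote, "utf-32")
    failed_codec = None
except UnicodeEncodeError as exc:
    # The exception names the codec used in step 1, not 'utf-32'.
    failed_codec = exc.encoding

# Pure ASCII text survives step 1, so a compatible decode succeeds.
ascii_result = py2_style_decode("abc", "utf-8")

print(failed_codec, ascii_result)
```

The requested codec never gets a chance to run for non-ASCII input, which is why swapping utf-32 for any other codec produces the same error.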

In short, when you have a unicode literal like u'', there is no need to decode it.
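
As a rough sketch of the usual pattern, again assuming Python 3 (where every str is already unicode): encode text only when bytes are needed, and decode only bytes. The choice of UTF-8 and the variable names here are illustrative assumptions, not something taken from the question.

```python
import unicodedata

quote = "\u201d"

# The code point can be inspected directly; no decoding is involved.
name = unicodedata.name(quote)

# Encode only when bytes are needed, e.g. for a file or a socket.
utf8_bytes = quote.encode("utf-8")

# Decode only bytes, using the same codec that produced them.
round_trip = utf8_bytes.decode("utf-8")

print(name, utf8_bytes, round_trip == quote)
```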

Read up on unicode, encodings, and default settings in the Python Unicode HOWTO. Another invaluable article on the subject is Joel Spolsky's Minimum Unicode knowledge post.
