urllib.unquote and unicode


Problem description

The following snippet results in different outcome for (at least) the
last three major releases:

>>> import urllib
>>> urllib.unquote(u'%94')

# Python 2.3.4
u'%94'

# Python 2.4.2
UnicodeDecodeError: 'ascii' codec can't decode byte 0x94 in position 0:
ordinal not in range(128)

# Python 2.5
u'\x94'

Is the current version the "right" one, or is this function supposed to
change every other week?

George

Solution


George Sakkis wrote:

The following snippet results in different outcome for (at least) the
last three major releases:

>>> import urllib
>>> urllib.unquote(u'%94')

# Python 2.3.4
u'%94'

# Python 2.4.2
UnicodeDecodeError: 'ascii' codec can't decode byte 0x94 in position 0:
ordinal not in range(128)

# Python 2.5
u'\x94'

Is the current version the "right" one, or is this function supposed to
change every other week?

IMHO, none of the results is right. Either the unicode string should be
rejected by raising ValueError, or it should be encoded with the ascii
encoding, so that the result is the same as
urllib.unquote(u'%94'.encode('ascii')), that is, '\x94'. You can consider
the current behaviour undefined: just as when you pass a random object
into some function, you can get a different outcome in different Python
versions.

-- Leo
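
Leo's second option can be sketched as a small wrapper. This is only an illustration under stated assumptions: `unquote_ascii` is a hypothetical helper name, not part of urllib, and the import fallback is there so the snippet also runs on Python 3, where the byte-oriented function is `urllib.parse.unquote_to_bytes`. Since `UnicodeEncodeError` is a subclass of `ValueError`, non-ASCII input ends up rejected much as his first option asks.

```python
try:
    from urllib import unquote as _unquote_bytes  # Python 2: str in, str out
except ImportError:
    # Python 3: the byte-oriented equivalent lives in urllib.parse
    from urllib.parse import unquote_to_bytes as _unquote_bytes


def unquote_ascii(text):
    """Hypothetical helper: encode to ASCII first, then unquote to bytes."""
    # Raises UnicodeEncodeError (a ValueError) if text is not pure ASCII.
    raw = text.encode('ascii')
    return _unquote_bytes(raw)


# The input from the question: always a single byte 0x94, never unicode.
result = unquote_ascii(u'%94')
```

With this wrapper the caller always gets a byte string back and has to decide explicitly how to decode it.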


George Sakkis wrote:

The following snippet results in different outcome for (at least) the
last three major releases:

>>> import urllib
>>> urllib.unquote(u'%94')

# Python 2.4.2
UnicodeDecodeError: 'ascii' codec can't decode byte 0x94 in position 0:
ordinal not in range(128)

Python 2.4.3 (#3, Aug 23 2006, 09:40:15)
[GCC 3.3.3 (SuSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import urllib
>>> urllib.unquote(u"%94")
u'\x94'

>>>

From the above I infer that the 2.4.2 behaviour was considered a bug.

Peter


George Sakkis wrote:

The following snippet results in different outcome for (at least) the
last three major releases:

>>> import urllib
>>> urllib.unquote(u'%94')

# Python 2.3.4
u'%94'

# Python 2.4.2
UnicodeDecodeError: 'ascii' codec can't decode byte 0x94 in position 0:
ordinal not in range(128)

# Python 2.5
u'\x94'

Is the current version the "right" one, or is this function supposed to
change every other week?

Why are you passing non-ASCII Unicode strings to a function designed for
fixing up 8-bit strings in the first place? If you do proper encoding
before you quote things, it'll work the same way in all Python releases.

</F>
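
The advice to encode before quoting could look like the sketch below. The helper names `quote_text` and `unquote_text` and the UTF-8 default are assumptions made for illustration; the thread does not say which charset is in use. The cp1252 call is included only because that code page maps the byte 0x94 from the question to a right double quotation mark.

```python
try:
    from urllib import quote, unquote as _unquote_bytes  # Python 2
except ImportError:
    # Python 3 equivalents live in urllib.parse
    from urllib.parse import quote, unquote_to_bytes as _unquote_bytes


def quote_text(text, charset='utf-8'):
    """Hypothetical helper: encode text to bytes, then percent-quote the bytes."""
    return quote(text.encode(charset))


def unquote_text(quoted, charset='utf-8'):
    """Hypothetical helper: unquote to bytes, then decode with the same charset."""
    return _unquote_bytes(quoted).decode(charset)


original = u'\u201d'  # RIGHT DOUBLE QUOTATION MARK
quoted = quote_text(original, 'cp1252')     # cp1252 encodes it as byte 0x94
roundtrip = unquote_text(quoted, 'cp1252')  # back to the original text
```

Keeping the charset explicit on both sides means the quoting functions only ever see bytes, so the round trip does not depend on how a given release treats unicode input.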

