string.replace non-ascii characters
Greetings Pythonistas. I have recently discovered a strange anomaly
with string.replace. It seemingly randomly does not deal with
characters of ordinal value > 127. I ran into this problem while
downloading auction web pages from ebay and trying to replace the
"\xa0" (dec 160, nbsp char in iso-8859-1) in the string I got from
urllib2. Yet today, all is fine, no problems whatsoever. Sadly, I
did not save the exact error message, but I believe it was a
ValueError thrown on string.replace and the message was something to
the effect "character value not within range(128)".
Some googling seemed to indicate other people have reported similar
troubles:
http://mail.python.org/pipermail/pyt...ly/391617.html
Anyone have any enlightening advice for me?
--
Sam Peterson
skpeterson At nospam ucdavis.edu
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown
Samuel Karl Peterson wrote:
> [...] Sadly, I did not save the exact error message, but I believe
> it was a ValueError thrown on string.replace and the message was
> something to the effect "character value not within range(128)".

Was it something like this?
>>> u'\xa0'.replace('\xa0', '')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 0:
ordinal not in range(128)
You might get that if you're mixing str and unicode. If both strings are
of one type or the other, you should be okay:
>>> u'\xa0'.replace(u'\xa0', '')
u''
>>> '\xa0'.replace('\xa0', '')
''
STeVe
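
For readers on Python 3, here is a minimal sketch of the same rule: keep both sides of `replace` the same type. The thread itself is about Python 2, where mixing `str` and `unicode` triggers an implicit ASCII decode. In Python 3 the mix is rejected outright with a `TypeError`. The helper names below are hypothetical and chosen only for illustration.

```python
# Python 3 sketch (an assumption: the thread's examples are Python 2).
NBSP_TEXT = "\xa0"    # U+00A0 NO-BREAK SPACE as text
NBSP_BYTES = b"\xa0"  # the same character encoded as ISO-8859-1


def strip_nbsp_text(text: str) -> str:
    """Remove non-breaking spaces from text; both operands are str."""
    return text.replace(NBSP_TEXT, "")


def strip_nbsp_bytes(data: bytes) -> bytes:
    """Remove ISO-8859-1 non-breaking spaces; both operands are bytes."""
    return data.replace(NBSP_BYTES, b"")


def mixing_fails() -> bool:
    """Return True if replacing bytes inside a str raises TypeError."""
    try:
        "a\xa0b".replace(NBSP_BYTES, b"")
    except TypeError:
        return True
    return False
```

Either helper works as long as the input type matches. The mixed call is the only one that fails.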
Steven Bethard <st************@gmail.com> on Sun, 11 Feb 2007 22:23:59
-0700 didst step forth and proclaim thus:
> Was it something like this?
>
> >>> u'\xa0'.replace('\xa0', '')
> Traceback (most recent call last):
>   File "<interactive input>", line 1, in <module>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position
> 0: ordinal not in range(128)

Yeah that looks like exactly what was happening, thank you. I wonder
why I had a unicode string though. I thought urllib2 always spat out
a plain string. Oh well.
u'\xa0'.encode('latin-1').replace('\xa0', " ")
Hooray.
--
Sam Peterson
skpeterson At nospam ucdavis.edu
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown
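
The fix above encodes the unicode string to bytes and then replaces. A sketch of the opposite direction, written for Python 3, is to decode once at the boundary and work on text from then on. It assumes the page is ISO-8859-1, as the original post describes. `to_text` and `replace_nbsp` are hypothetical names, not part of urllib2 or any other library.

```python
from typing import Union


def to_text(data: Union[bytes, str], encoding: str = "iso-8859-1") -> str:
    """Return text whether the caller passed raw bytes or already-decoded text."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


def replace_nbsp(data: Union[bytes, str], encoding: str = "iso-8859-1") -> str:
    """Swap each non-breaking space (U+00A0) for a plain space."""
    return to_text(data, encoding).replace("\xa0", " ")
```

With this shape it no longer matters whether the download step hands back bytes or text, which was the source of the surprise in this thread.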
On Mon, 12 Feb 2007 02:38:29 -0300, Samuel Karl Peterson
<sk********@nospam.please.ucdavis.edu> wrote:
Sorry to steal the thread! This is only related to your signature:
> "if programmers were paid to remove code instead of adding it,
> software would be much better" -- unknown

I just did that last week. Around 250 useless lines removed from a
1000-line module. I think the original coder didn't read the tutorial past the
dictionary examples: *all* functions returned a dictionary or list of
dictionaries! Of course using different names for the same thing here and
there, ugh... I just threw in a few classes and containers, removed all
the nonsensical packing/unpacking of data going back and forth, for a net
decrease of 25% in size (and a great increase in robustness,
maintainability, etc).
If I were paid for the number of lines *written* that would not be a great
deal :)
--
Gabriel Genellina