string.replace non-ascii characters

Problem description

Greetings Pythonistas. I have recently discovered a strange anomaly
with string.replace. It seemingly, randomly does not deal with
characters of ordinal value > 127. I ran into this problem while
downloading auction web pages from ebay and trying to replace the
"\xa0" (dec 160, nbsp char in iso-8859-1) in the string I got from
urllib2. Yet today, all is fine, no problems whatsoever. Sadly, I
did not save the exact error message, but I believe it was a
ValueError thrown on string.replace and the message was something to
the effect "character value not within range(128)".

Some googling seemed to indicate other people have reported similar
troubles:

http://mail.python.org/pipermail/pyt...ly/391617.html

Anyone have any enlightening advice for me?

--
Sam Peterson
skpeterson At nospam ucdavis.edu
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown

Solution

Was it something like this?

>>> u'\xa0'.replace('\xa0', '')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 0:
ordinal not in range(128)

You might get that if you're mixing str and unicode. If both strings are
of one type or the other, you should be okay:

>>> u'\xa0'.replace(u'\xa0', '')
u''
>>> '\xa0'.replace('\xa0', '')
''

STeVe
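
The session above is Python 2, where a str argument passed to a unicode method is converted implicitly with the default ASCII codec. As a rough sketch of the same idea in Python 3 (an assumption on my part, since the thread predates it), the conversion can be done explicitly to reproduce the message, and Python 3 refuses to mix the two types at all. The variable names are only illustrative.

```python
# Python 3 sketch; the thread itself uses Python 2 str/unicode.
raw = b"\xa0"   # byte string, e.g. as read from a socket or file
text = "\xa0"   # text string holding U+00A0 (no-break space)

# Python 2 converted the str argument to unicode with the ASCII codec.
# Doing that conversion explicitly reproduces the error in the traceback.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Same type on both sides: both calls succeed.
text_result = text.replace("\xa0", "")
bytes_result = raw.replace(b"\xa0", b"")

# Python 3 rejects the mixed call instead of guessing an encoding.
try:
    text.replace(raw, "")
except TypeError:
    mixed_types_rejected = True
else:
    mixed_types_rejected = False

print(error_message)
```

Either way the fix is the same as in the reply: make both operands the same type before calling replace.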


Steven Bethard <st************@gmail.com> on Sun, 11 Feb 2007 22:23:59
-0700 didst step forth and proclaim thus:

> Was it something like this?
>
> >>> u'\xa0'.replace('\xa0', '')
> Traceback (most recent call last):
>   File "<interactive input>", line 1, in <module>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position
> 0: ordinal not in range(128)

Yeah that looks like exactly what was happening, thank you. I wonder
why I had a unicode string though. I thought urllib2 always spat out
a plain string. Oh well.

u'\xa0'.encode('latin-1').replace('\xa0', " ")

Hooray.
--
Sam Peterson
skpeterson At nospam ucdavis.edu
"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown
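
The one-liner in this reply encodes the unicode string back to bytes before replacing. A common alternative is to go the other way: decode the downloaded bytes once, then work only with text. The following is a minimal Python 3 sketch under that assumption; `strip_nbsp`, `raw_page` and the sample bytes are hypothetical names and data made up for illustration, and the latin-1 default simply follows the iso-8859-1 encoding mentioned in the question.

```python
def strip_nbsp(raw_page, encoding="latin-1"):
    """Decode downloaded bytes once, then replace no-break spaces with spaces."""
    text = raw_page.decode(encoding)   # bytes -> str, done at the boundary
    return text.replace("\xa0", " ")   # both operands are str from here on


# Hypothetical sample standing in for a fragment of a downloaded page.
sample = b"Current bid:\xa0US $12.50"
cleaned = strip_nbsp(sample)
print(cleaned)
```

If the page declares a different charset, that name would be passed as `encoding` instead.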


On Mon, 12 Feb 2007 02:38:29 -0300, Samuel Karl Peterson
<sk********@nospam.please.ucdavis.edu> wrote:

Sorry to steal the thread! This is only related to your signature:

"if programmers were paid to remove code instead of adding it,
software would be much better" -- unknown

I just did that last week. Around 250 useless lines removed from a
1000-line module. I think the original coder didn't read the tutorial past the
dictionary examples: *all* functions returned a dictionary or list of
dictionaries! Of course using different names for the same thing here and
there, ugh... I just threw in a few classes and containers, removed all
the nonsensical packing/unpacking of data going back and forth, for a net
decrease of 25% in size (and a great increase in robustness,
maintainability, etc).
If I were paid for the number of lines *written*, that would not be a great
deal :)

--
Gabriel Genellina

