Python: Replace non ascii characters in a list of strings

Problem description

I understand there are many non-ASCII character questions on Stack Overflow, but since I'm a total newb I've had no luck implementing the answers, plus I find the whole 'unicode' concept difficult to understand.

So I have a list -

mylist = ["apple", "samsung", "toshiba", "Don’t know", "Can’t recall"] 

I would like to access the single quote marks at index 3 and 4 and replace them with an apostrophe.

I tried this:

# -*- coding: utf-8 -*-
mylist = ["hello", "don't know", "Don’t know", "Can't recall"]
for word in mylist:
    word.replace(u"’", "'")
print mylist

I get the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 3: ordinal not in range(128)

Not sure if this is useful, but I am using Python 2.x, and I know this problem might not occur if I were using version 3.

Thanks!

Solution

>>> mylist = ["apple", "samsung", "toshiba", "Don’t know", "Can’t recall"]
>>> [item.replace('\xe2\x80\x99',"'") for item in mylist]
['apple', 'samsung', 'toshiba', "Don't know", "Can't recall"]
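
For context on why this works: '\xe2\x80\x99' is the UTF-8 encoding of U+2019 (RIGHT SINGLE QUOTATION MARK). In Python 2 a plain "..." literal in a UTF-8 source file is a byte string, and passing a u"..." argument to its replace() makes Python decode those bytes as ASCII first, which is where the UnicodeDecodeError comes from. Below is a minimal sketch of the same byte-level replacement, written in Python 3 syntax so it can be run today; the helper name replace_curly_apostrophe and the variable names are made up for illustration.

```python
# U+2019 (the curly apostrophe) becomes three bytes when encoded as UTF-8.
CURLY = "\u2019"
CURLY_UTF8 = CURLY.encode("utf-8")  # b'\xe2\x80\x99'


def replace_curly_apostrophe(raw_items):
    """Swap the UTF-8 bytes of U+2019 for an ASCII apostrophe in each byte string."""
    return [item.replace(CURLY_UTF8, b"'") for item in raw_items]


# Byte strings, roughly what Python 2 holds for non-unicode literals in a UTF-8 file.
raw = [s.encode("utf-8") for s in ["apple", "Don\u2019t know", "Can\u2019t recall"]]
cleaned = replace_curly_apostrophe(raw)
print(cleaned)

# Decoding the same bytes as ASCII fails at the first byte of the curly apostrophe,
# which is the failure the question ran into.
try:
    raw[1].decode("ascii")
    decode_error_position = None
except UnicodeDecodeError as exc:
    decode_error_position = exc.start
    print(exc)
```

The failing position is 3 because "Don" takes up bytes 0 to 2 and the 0xe2 byte comes right after it.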

If all the items are already unicode:

>>> mylist = [u"apple", u"samsung", u"toshiba", u"Don’t know", u"Can’t recall"]
>>> [item.replace(u'’',u"'") for item in mylist]
[u'apple', u'samsung', u'toshiba', u"Don't know", u"Can't recall"]
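
On Python 3 every str is already unicode, so the second form should work without the u prefix. Note also that the loop in the question throws away the return value of replace(); strings are immutable, so the results have to be collected into a new list. A short sketch, assuming Python 3 (the names fixed, QUOTE_MAP and fixed_all are hypothetical, and the translate() variant is an optional extra, not part of the original answer):

```python
mylist = ["apple", "samsung", "toshiba", "Don\u2019t know", "Can\u2019t recall"]

# str.replace returns a new string, so build a new list from the results.
fixed = [item.replace("\u2019", "'") for item in mylist]
print(fixed)

# Optional: map several typographic quotes to ASCII in one pass.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})
fixed_all = [item.translate(QUOTE_MAP) for item in mylist]
print(fixed_all)
```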
