Character encoding in Python to replace 'u2019' with '

Question

I have tried numerous ways to encode this to the end result "BACK RUSHIN'", with the most important character being the right apostrophe '.

I would like a way of getting to this end result using some of the built-in functions Python has, where there is no discrimination between a normal string and a unicode string.

This was the code I was using to retrieve the string: str(unicode(etree.tostring(root.xpath('path')[0],method='text', encoding='utf-8'),errors='ignore')).strip()

The result was: 'BACK RUSHIN', with the right apostrophe missing.

Another way was: root.xpath('path/text()')

And that result was: u'BACK RUSHIN\u2019' in Python.

Lastly, if I try: u'BACK RUSHIN\u2019'.encode('ascii', 'replace')

The result is: 'BACK RUSHIN?'

Please no replace functions; I would like to make use of Python's codec libraries. Also, no printing the string, because it is being held in a variable.

Thanks

Recommended answer

>>> import unidecode
>>> unidecode.unidecode(u'BACK RUSHIN\u2019')
"BACK RUSHIN'"

unidecode
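
Since the question asks for something built on Python's codec machinery rather than a replace call, here is a minimal standard-library sketch of one alternative: a custom error handler registered with codecs.register_error. It is written for Python 3 (the question uses Python 2 syntax), and the handler name "ascii_punctuation" and the PUNCTUATION_FALLBACKS table are made up for this illustration; they are not part of unidecode or of the original answer.

```python
import codecs

# Hypothetical fallback table for this sketch: typographic quotes -> ASCII.
PUNCTUATION_FALLBACKS = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (the apostrophe in question)
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}


def ascii_punctuation(error):
    """Encode error handler: map known punctuation to ASCII, anything else to '?'."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    # The slice holds the run of characters the codec could not encode.
    unencodable = error.object[error.start:error.end]
    replacement = "".join(PUNCTUATION_FALLBACKS.get(ch, "?") for ch in unencodable)
    # Return the substitute text and the position where encoding should resume.
    return replacement, error.end


# Register the handler under an assumed name so encode() can look it up.
codecs.register_error("ascii_punctuation", ascii_punctuation)

title = "BACK RUSHIN\u2019"
ascii_title = title.encode("ascii", errors="ascii_punctuation").decode("ascii")
```

Here ascii_title holds the string with a plain ASCII apostrophe, without any printing. Unlike unidecode, this handler only knows the handful of characters listed in the table, so accented letters and other symbols still fall back to '?'.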
