Character encoding in Python to replace 'u2019' with '
Question
I have tried numerous ways to encode this string to reach the end result "BACK RUSHIN'", the most important character being the right apostrophe (').
I would like a way of getting to this end result using some of Python's built-in functions, without having to distinguish between a normal string and a unicode string.
This was the code I was using to retrieve the string: str(unicode(etree.tostring(root.xpath('path')[0], method='text', encoding='utf-8'), errors='ignore')).strip()
The result was: 'BACK RUSHIN', with the apostrophe missing.
Another approach I tried was: root.xpath('path/text()')
And that result was: u'BACK RUSHIN\u2019' in Python.
Lastly, if I try: u'BACK RUSHIN\u2019'.encode('ascii', 'replace')
The result is: 'BACK RUSHIN?'
Please no replace functions; I would like to make use of Python's codec libraries. Also no printing the string, because it is being held in a variable.
Thanks
Recommended answer
>>> import unidecode
>>> unidecode.unidecode(u'BACK RUSHIN\u2019')
"BACK RUSHIN'"
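Note that unidecode is a third-party package, not part of the standard library. For readers who would rather stay within Python's own codec machinery, as the question asks, one possible approach is a custom error handler registered through codecs.register_error. The sketch below is written for Python 3 (the question uses Python 2 syntax), and the handler name ascii_fallback and the table ASCII_FALLBACKS are hypothetical names chosen for illustration; the mapping covers only a few typographic quotes and would need extending for other characters.

```python
import codecs

# Hypothetical table: typographic punctuation -> plain ASCII stand-ins.
ASCII_FALLBACKS = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (the apostrophe in question)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
}


def ascii_fallback(error):
    """Codec error handler: substitute known characters, '?' for the rest."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    # The slice error.object[start:end] holds the characters that failed to encode.
    unencodable = error.object[error.start:error.end]
    replacement = "".join(ASCII_FALLBACKS.get(ch, "?") for ch in unencodable)
    # Return the replacement text and the position where encoding should resume.
    return replacement, error.end


# Register under an illustrative name so it can be passed as errors=...
codecs.register_error("ascii_fallback", ascii_fallback)

title = "BACK RUSHIN\u2019"
encoded = title.encode("ascii", "ascii_fallback")   # bytes, pure ASCII
decoded = encoded.decode("ascii")                   # back to a str
```

Once registered, the handler can be used anywhere an errors argument is accepted, in the same position as the built-in 'ignore' and 'replace' handlers tried in the question. The difference is that it substitutes a chosen ASCII character instead of dropping the apostrophe or turning it into '?'.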