Transform unicode string in python

Question

{u'Status': u'OK', u'City': u'Ciri\xe8', u'TimezoneName': '', u'ZipPostalCode': '', u'CountryCode': u'IT', u'Dstoffset': u'0', u'Ip': u'x.x.x.x', u'Longitude': u'7.6', u'CountryName': u'Italy', u'RegionCode': u'12', u'Latitude': u'45.2333', u'Isdst': '', u'Gmtoffset': u'0', u'RegionName': u'Piemonte'}

This is the output of my object. I would like to access City, but it is encoded. How can I read all the parameters and decode them?

>>> data['City']
u'Ciri\xe8'

>>> data['City'].decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' in position 4: ordinal not in range(128)

I want plain text, not a unicode string. Thank you!

Recommended answer

Read: http://nedbatchelder.com/text/unipain.html

Then print it:

>>> data = {u'City':u'Ciri\xe8'}
>>> data['City']
u'Ciri\xe8'
>>> print data['City']
Ciriè

If you don't print it, the interactive interpreter shows a safe representation (the repr) of the string: the u'' prefix indicates that it is Unicode text, and \xe8 is an escaped non-ASCII character. print attempts to display the non-ASCII character by encoding the Unicode string in the terminal encoding. It may fail if the string contains characters that the terminal encoding doesn't support:

>>> print u'\xe8'
è
>>> print u'\x81'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "d:\dev\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\x81' in position 0: character maps to <undefined>

In the above example, code page 437 supports the Unicode character U+00E8 but not U+0081.
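
As a complement to the answer, here is a minimal sketch of getting byte strings out of the dictionary, for cases where the data has to be written to a file or socket rather than printed. It assumes UTF-8 is the desired output encoding. The helper name encode_values and the trimmed-down data dictionary are illustrative and not part of the original post. The first step also shows why the call in the question raised UnicodeEncodeError: in Python 2, calling .decode() on a unicode object first encodes it implicitly with the ASCII codec, and that is the step that fails.

```python
# Trimmed-down copy of the dictionary from the question (illustrative subset).
data = {u'Status': u'OK', u'City': u'Ciri\xe8', u'CountryCode': u'IT'}

# The text type is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u'')

# Python 2 performs this ASCII encode implicitly before unicode.decode();
# it cannot represent U+00E8, hence the UnicodeEncodeError in the question.
try:
    data[u'City'].encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False


def encode_values(mapping, encoding='utf-8', errors='strict'):
    """Return a copy of mapping with every text value encoded to bytes.

    Hypothetical helper for illustration; non-text values are kept as-is.
    """
    return {
        key: value.encode(encoding, errors)
        if isinstance(value, text_type) else value
        for key, value in mapping.items()
    }


utf8_data = encode_values(data)
print(repr(utf8_data[u'City']))

# cp437 has no mapping for U+0081; 'replace' substitutes '?' instead of raising.
fallback = u'\x81'.encode('cp437', 'replace')
print(repr(fallback))
```

With errors='replace', an unsupported character becomes '?' instead of raising an exception, at the cost of losing the original character, so it is probably best kept for display purposes only.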
