Encoding and then decoding JSON with non-ASCII characters in Python 2.7


Question

I have a Python application that encodes some objects to JSON, passes the JSON string to another program, and then reads in a possibly modified version of that JSON string.

I need to check what has changed in the JSON-encoded objects. However, I'm having trouble re-encoding non-ASCII characters. For example:

import json

x = {'\xe2': None} # a dict with a non-ASCII byte-string key
y = json.dumps(x,ensure_ascii=False)
y
#> '{"\xe2": null}'

This works fine, but when I try to load the JSON, I get:

json.loads(y)
#> UnicodeDecodeError: 'utf8' codec can't decode byte 0xe2 in position 0
json.loads(y.decode('utf-8','ignore'))
#> {u'': None}
json.loads(y.decode('utf-8','replace'))
#> {u'\ufffd': None}

Unfortunately, '\xe2' in {u'\ufffd': None} evaluates to False.

I'm willing to bet there is a simple solution, but all my googling and searching on SO has failed to turn one up.

Recommended answer

The easiest way to fix this is to go to whatever is generating this dict and properly encode its strings as UTF-8 there. Currently, your keys are encoded as CP-1252.

print('\xe2'.decode('cp1252'))
â
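
To make the diagnosis concrete, here is a minimal sketch (not part of the original answer) showing why the byte from the question cannot be read as UTF-8, what it decodes to under CP-1252, and what the same character looks like once it is encoded as UTF-8. The variable names are illustrative only, and the `b''` literals are used so the snippet behaves the same on Python 2.7 and Python 3.

```python
raw = b'\xe2'  # the single byte used as a key in the question

# A lone 0xE2 byte starts a multi-byte UTF-8 sequence that never completes,
# so strict UTF-8 decoding raises UnicodeDecodeError.
try:
    raw.decode('utf-8')
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# Under CP-1252 the same byte maps to U+00E2 (a with circumflex).
text = raw.decode('cp1252')

# This is the byte sequence the producer would emit if it encoded as UTF-8.
utf8_bytes = text.encode('utf-8')

print(is_valid_utf8, repr(text), repr(utf8_bytes))
```

If the producer emitted the two-byte UTF-8 form instead of the single CP-1252 byte, json.loads would be able to decode the key without any error handler.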

If you can't fix it at the source, you'll need to do some post-processing.

d = {'\xe2': None}

fixed_d = {k.decode('cp1252'):v for k,v in d.iteritems()}

json.dumps(fixed_d)
Out[24]: '{"\\u00e2": null}'

json_dict_with_unicode_keys = json.dumps(fixed_d)

json_dict_with_unicode_keys
Out[32]: '{"\\u00e2": null}'

print(json.loads(json_dict_with_unicode_keys).keys()[0])
â
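
The dict comprehension above only touches top-level keys. As a rough sketch of how the same idea might extend to nested data, the hypothetical helper decode_bytes below (my own name, not from the answer) walks dicts and lists and decodes every byte string, assuming everything really is CP-1252. It then round-trips the result through JSON so the membership check from the question can be repeated.

```python
import json

def decode_bytes(obj, encoding='cp1252'):
    """Hypothetical helper: recursively decode byte strings to Unicode text."""
    if isinstance(obj, bytes):  # on Python 2.7, bytes is the plain str type
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return {decode_bytes(k, encoding): decode_bytes(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [decode_bytes(item, encoding) for item in obj]
    return obj  # numbers, None, and text that is already Unicode pass through

original = {b'\xe2': None, b'plain': [b'caf\xe9', 1]}
fixed = decode_bytes(original)

encoded = json.dumps(fixed)     # non-ASCII characters are escaped as \uXXXX
restored = json.loads(encoded)  # keys and values come back as Unicode text

print(u'\xe2' in restored)      # the key from the question is found again
```

The point is that any later comparison has to be made against the Unicode key (u'\xe2'), not the original byte string, since JSON has no notion of raw bytes.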

(Some of this answer assumes you're on Python 2; Unicode handling differs in Python 3.)
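
For comparison, here is a small Python 3 sketch of the same situation; it is an illustration rather than something the original answer covers. In Python 3, text (str) and bytes are separate types, and json.dumps refuses bytes keys outright instead of producing a string that fails later, so the decode step has to happen before encoding.

```python
import json

# Python 3 only: json.dumps raises TypeError for bytes keys.
try:
    json.dumps({b'\xe2': None})
    rejected = False
except TypeError:
    rejected = True

# Decode the keys first (CP-1252 is the assumption carried over from above).
fixed = {k.decode('cp1252'): v for k, v in {b'\xe2': None}.items()}

# With ensure_ascii=False the character is written literally; the round trip
# still yields the same Unicode key.
round_tripped = json.loads(json.dumps(fixed, ensure_ascii=False))

print(rejected, round_tripped)
```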
