json.dumps \u escaped unicode to utf8
Question
I came from this old discussion, but the solution didn't help much as my original data was encoded differently:
My original data was already encoded in Unicode, and I need to output it as UTF-8:
data={"content":u"\u4f60\u597d"}
When I try to convert to UTF-8:
json.dumps(data, indent=1, ensure_ascii=False).encode("utf8")
The output I get is
"content": "ä½ å¥½"
and the expected output should be
"content": "你好"
I tried without ensure_ascii=False, and the output becomes the plain escaped form "content": "\u4f60\u597d"
How can I convert the previously \u escaped json to UTF-8?
Recommended answer
You have UTF-8 JSON data:
>>> import json
>>> data = {'content': u'\u4f60\u597d'}
>>> json.dumps(data, indent=1, ensure_ascii=False)
u'{\n "content": "\u4f60\u597d"\n}'
>>> json.dumps(data, indent=1, ensure_ascii=False).encode('utf8')
'{\n "content": "\xe4\xbd\xa0\xe5\xa5\xbd"\n}'
>>> print json.dumps(data, indent=1, ensure_ascii=False).encode('utf8')
{
"content": "你好"
}
My terminal just happens to be configured to handle UTF-8, so printing the UTF-8 bytes to my terminal produced the desired output.
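The session above is Python 2. As a rough Python 3 sketch of the same point, the snippet below inspects the serialized bytes with repr() so that the result does not depend on how the terminal decodes them. The variable names are only illustrative, and sys.stdout.encoding is printed merely as a hint of what the terminal expects:

```python
import json
import sys

# Same data as in the question; in Python 3 every str is already Unicode.
data = {"content": "\u4f60\u597d"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(data, indent=1)

# ensure_ascii=False keeps the characters; encode() turns the text into UTF-8 bytes.
text = json.dumps(data, indent=1, ensure_ascii=False)
utf8_bytes = text.encode("utf-8")

# repr() shows the raw contents without relying on the terminal's encoding.
print(repr(escaped))
print(repr(utf8_bytes))

# What Python believes the terminal expects (may be None when output is redirected).
print(sys.stdout.encoding)
```

Both forms are valid JSON and decode to the same data; they differ only in how the characters are written.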
However, if your terminal is not set up for such output, it is your terminal that then shows 'wrong' characters:
>>> print json.dumps(data, indent=1, ensure_ascii=False).encode('utf8').decode('latin1')
{
"content": "ä½ å¥½"
}
Note how I decoded the data to Latin-1 to deliberately mis-read the UTF-8 bytes.
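A small Python 3 sketch of that mis-reading, and of undoing it, might look like the following. It assumes the bytes really were mis-decoded as Latin-1; if the reading tool used a different code page, the round trip may not recover the text. The variable names are hypothetical:

```python
# "\u4f60\u597d" is the two-character string from the question.
original = "\u4f60\u597d"

# Correct UTF-8 bytes: six bytes, three per character.
utf8_bytes = original.encode("utf-8")

# Deliberately mis-read the bytes as Latin-1: each byte becomes one character.
misread = utf8_bytes.decode("latin-1")

# Reverse the mistake: recover the original bytes, then decode them as UTF-8.
repaired = misread.encode("latin-1").decode("utf-8")

print(ascii(misread))        # six Latin-1 characters instead of two CJK ones
print(repaired == original)  # the round trip restores the text
```

The round trip works here because Latin-1 maps every byte value to exactly one character, so no information is lost by the wrong decode.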
This isn't a Python problem; this is a problem with how you are handling the UTF-8 bytes in whatever tool you used to read these bytes.
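One way to take the terminal out of the picture is to write the JSON to a file with an explicit encoding and then look at the raw bytes. The following is a minimal Python 3 sketch; the temporary directory and the file name data.json are assumptions made for illustration only:

```python
import json
import os
import tempfile

data = {"content": "\u4f60\u597d"}

# Hypothetical output location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "data.json")

# Write real characters (not \u escapes) and state the file encoding explicitly.
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, indent=1, ensure_ascii=False)

# Read the raw bytes back to confirm they are UTF-8.
with open(path, "rb") as f:
    raw = f.read()

# Read the file back as text, again naming the encoding.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(repr(raw))
print(loaded == data)
```

Any viewer or editor that opens this file also has to be told, or has to detect, that it is UTF-8; otherwise it will show the same 'wrong' characters as above.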