Conversion error between JSON for Chinese characters

Question

I have a Chinese string "普派" that I want to send from the client to a web server in an HTTP POST request. On the client side, I use the following jQuery code:

$.ajax({
    url: 'http://127.0.0.1:8000/detect/word',
    type: 'POST',
    data: JSON.stringify('普派'),
    success: function(msg) {
        alert(msg);
    }
});

On the server side, I use Python 3.3:

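import json
from http.server import BaseHTTPRequestHandler

class DictRequestHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly Content-Length bytes of the request body
        post_data = self.rfile.read(int(self.headers['Content-Length']))
        post_var = json.loads(post_data.decode())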

But the result (post_var) is garbled. The variable post_data, of type bytes, is b'"\xc3\xa6\xe2\x84\xa2\xc2\xae\xc3\xa6\xc2\xb4\xc2\xbe"', but for a correct conversion I expected b'"\u666e\u6d3e"' (as obtained from json.dumps("普派").encode()). Could you please help me solve this problem? Thank you very much.

Recommended answer

The result of JSON.stringify('普派') depends on the encoding of your source file. Remember, what's really between those quotes is just a sequence of bytes; it is your editor (or browser) that displays it as '普派'.
If the browser correctly detects your source encoding, it shouldn't really matter, but if it doesn't, you'll end up with garbage.
So make sure to declare the correct file encoding (preferably UTF-8).
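
The bytes reported in the question are consistent with one particular mix-up: a UTF-8 source read as Windows-1252 and then sent as UTF-8. The Python sketch below reproduces them under that assumption. The question does not confirm which charset the browser actually guessed, so the choice of cp1252 is an assumption, and the variable names are illustrative only.

```python
# Reproduce the garbled bytes from the question under one assumption:
# the UTF-8 source bytes were decoded as Windows-1252 (cp1252).
original = "\u666e\u6d3e"  # the string 普派, written with escapes

utf8_bytes = original.encode("utf-8")   # bytes as stored in a UTF-8 file
misread = utf8_bytes.decode("cp1252")   # what a wrong charset guess yields
resent = misread.encode("utf-8")        # what then travels in the POST body

print(utf8_bytes)
print(resent)

# Reversing the two steps recovers the text, as long as every byte
# has a cp1252 mapping (true for this string, not for all input).
recovered = resent.decode("utf-8").encode("cp1252").decode("utf-8")
print(recovered == original)
```

The round trip at the end is only a diagnostic aid; declaring the right encoding for the page or script remains the actual fix.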

To be independent of such browser-dependent interpretations, try changing it to JSON.stringify("\u666e\u6d3e").

The JSON standard doesn't mandate that Unicode characters be replaced with their Unicode escape sequences on encoding. It only specifies that the encoding should be Unicode and allows 'any Unicode character' within JSON strings, so the result of JSON.stringify isn't wrong if it encodes the given characters as UTF-8.
Either form should be fine, so what you should see on your server side is probably b'"\xe6\x99\xae\xe6\xb4\xbe"'.
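
To illustrate that both forms decode to the same string on the server, here is a short Python sketch using only the standard json module. The variable names are illustrative, and the explicit decode("utf-8") mirrors what the handler in the question does with its request body.

```python
import json

text = "\u666e\u6d3e"  # the string 普派

escaped = json.dumps(text)                  # ASCII-only, uses \uXXXX escapes
raw = json.dumps(text, ensure_ascii=False)  # keeps the characters as-is

escaped_body = escaped.encode("utf-8")
raw_body = raw.encode("utf-8")
print(escaped_body)
print(raw_body)

# Decode the body as UTF-8 first, then parse; both forms give the same text.
print(json.loads(escaped_body.decode("utf-8")) == text)
print(json.loads(raw_body.decode("utf-8")) == text)
```

So the server code does not need to care which form the client sends, provided the body really is valid UTF-8.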
