Can a malformed JSON string be parsed successfully?


Question

Here is a sample string:

String s = "{\"source\": \"another \"quote inside\" text\"}";

What's the best way to parse this? I've already tried four parsers: json-lib, json-simple, gson, and the Grails built-in JSON parser.

I'm using Java, and I want to know whether there is a way to repair the string after catching a MalformedJsonException or a similar error.

Note: Or might this be a bug in the Twitter API? Here's a sample response string:

{
    "coordinates": null,
    "user": {
        "is_translator": false,
        "show_all_inline_media": false,
        "following": null,
        "geo_enabled": false,
        "profile_background_color": "C0DEED",
        "listed_count": 11,
        "profile_background_image_url": "http://a3.twimg.com/a/1298064126/images/themes/theme1/bg.png",
        "favourites_count": 4,
        "followers_count": 66,
        "contributors_enabled": false,
        "statuses_count": 1078,
        "time_zone": "Tokyo",
        "profile_text_color": "333333",
        "friends_count": 51,
        "profile_sidebar_fill_color": "DDEEF6",
        "id_str": "107723125",
        "profile_background_tile": false,
        "created_at": "Sat Jan 23 14:16:03 +0000 2010",
        "profile_image_url": "http://a3.twimg.com/profile_images/652140488/--------------_normal.jpg",
        "description": "Mu8ecdu56e3u306eu56e3u9577u3068u30eau30fcu30c0u30fcu3067u3059u3002u8da3u5473u306fu7af6u99acu306eu4e88u60f3u3068u30b0u30e9u30c3u30d7u30eau30f3u30b0u3068u6253u6483u3092u30e1u30a4u30f3u3068u3057u3066u3044u307eu3059u3063uff01",
        "location": "u5bccu5c71u770c",
        "notifications": null,
        "profile_link_color": "0084B4",
        "protected": false,
        "screen_name": "mattsun0209",
        "follow_request_sent": null,
        "lang": "ja",
        "profile_sidebar_border_color": "C0DEED",
        "name": "u307eu3063u3064u3093",
        "verified": false,
        "id": 107723125,
        "profile_use_background_image": true,
        "utc_offset": 32400,
        "url": null
    },
    "in_reply_to_screen_name": null,
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "text": "u3042u30fcu3001u7d50u819cu708eu306bu306au3063u3066u3057u307eu3063u305fu3002",
    "contributors": null,
    "retweeted": false,
    "in_reply_to_user_id_str": null,
    "retweet_count": 0,
    "source": "u003Ca href="http: //twtr.jp" rel="nofollow"u003EKeitai Webu003C/au003E",
    "id_str": "42128197566861312",
    "created_at": "Mon Feb 28 07:45:19 +0000 2011",
    "geo": null,
    "entities": {
        "hashtags": [],
        "user_mentions": [],
        "urls": []
    },
    "truncated": false,
    "place": null,
    "id": 42128197566861312,
    "favorited": false
}

Note the source property:

"source": "u003Ca href="http: //twtr.jp" rel="nofollow"u003EKeitai Webu003C/au003E"

Recommended answer

I'm afraid that's a classic "garbage in, garbage out" situation. The JSON is invalid, so you can't parse it properly; you can only guess at what it was meant to be. We humans can guess the intent fairly well, but that is much harder at the parser level.

If you know that you are consistently getting this invalid source property, you could pre-process the string before deserializing it. The real fix, though, has to be at the source of the invalid data: Twitter, or whatever twit (as it were) is providing it. I'm assuming that this is the actual string data you've received, and not a processed form of it.
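As a rough illustration of the pre-processing idea, here is a minimal Java sketch that uses only the standard library. It is a heuristic, not a general JSON repair tool: it assumes the broken value always belongs to a property named source, and that the value ends at the first quote that is followed either by a closing brace or by a comma and the next property name. The class and method names (SourceFieldRepair, repairSourceField, escapeBareQuotes) are hypothetical and chosen only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: escapes bare double quotes inside the "source" value
// so that a regular JSON parser can read the document afterwards.
final class SourceFieldRepair {

    // Group 1: the key and the opening quote of the value.
    // Group 2: the raw value, matched lazily.
    // Lookahead: the closing quote, assumed to be followed by `}` or by `, "nextKey":`.
    private static final Pattern SOURCE_VALUE = Pattern.compile(
        "(\"source\"\\s*:\\s*\")(.*?)(?=\"\\s*(?:,\\s*\"[^\"]*\"\\s*:|\\}))",
        Pattern.DOTALL);

    // Escapes every double quote that is not already part of a backslash escape.
    static String escapeBareQuotes(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\' && i + 1 < value.length()) {
                // Copy an existing escape sequence unchanged.
                out.append(c).append(value.charAt(++i));
            } else if (c == '"') {
                out.append("\\\"");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Rewrites each "source" value found in the input; everything else is left as-is.
    static String repairSourceField(String json) {
        Matcher m = SOURCE_VALUE.matcher(json);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fixed = m.group(1) + escapeBareQuotes(m.group(2));
            m.appendReplacement(out, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "{\"source\": \"another \"quote inside\" text\"}";
        // prints {"source": "another \"quote inside\" text"}
        System.out.println(repairSourceField(s));
    }
}
```

The repaired string can then be handed to whichever parser you already use. The guess breaks down if the value itself contains a quote followed by a comma and something that looks like a property name, which is why fixing the data at its origin remains the better option.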
