Extract JSON from text


Problem description

An AJAX call is returning a response text that includes a JSON string. I need to:


  1. Extract the JSON string

  2. Modify it

  3. Reinsert it to update the original string

I am not too worried about steps 2 and 3, but I can't figure out how to do step 1. I was thinking about using a regular expression, but I don't know how, since my JSON might have multiple levels of nested objects or arrays.

Recommended answer

You cannot use a regex to extract JSON from arbitrary text. Since regexes are usually not powerful enough to validate JSON (unless you can use PCRE), they cannot reliably match it either: if they could, they could also validate JSON.
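As a small illustration (not part of the original answer), the sketch below tries two naive patterns on a made-up sample string. The patterns, the sample text, and the `isJSON` helper are assumptions chosen for the demonstration. Neither match is valid JSON: the greedy pattern swallows surrounding text and the lazy one stops at the first closing brace.

```javascript
// Greedy: everything from the first "{" to the last "}".
var greedy = /\{[\s\S]*\}/;
// Lazy: from the first "{" to the nearest following "}".
var lazy = /\{[\s\S]*?\}/;

// Made-up sample: prose with stray braces around one real JSON object.
var text = 'note {draft} payload: {"a": {"b": 1}} end }';

var greedyMatch = text.match(greedy)[0];
var lazyMatch = text.match(lazy)[0];

// Hypothetical helper: true if the string parses as JSON.
function isJSON(s) {
    try {
        JSON.parse(s);
        return true;
    } catch (e) {
        return false;
    }
}

console.log(JSON.stringify(greedyMatch), isJSON(greedyMatch));
console.log(JSON.stringify(lazyMatch), isJSON(lazyMatch));
```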

However, if you know that the top-level element of your JSON is always an object or array, you can use the following approach:


  1. Find the first opening brace ({ or [) and the last closing brace (} or ]) in your string.
  2. Try to parse that block of text (including the braces) using JSON.parse(). If it succeeds, finish and return the parsed result.
  3. Otherwise, take the previous closing brace and try parsing the shorter block. If it succeeds, you are done.
  4. Repeat step 3 until there is no closing brace left, or the only ones left come before the current opening brace.
  5. Find the first opening brace after the current one. If there is none, the string does not contain a JSON object/array and you can stop.
  6. Go back to step 2, starting again from the last closing brace.

Here is a function that extracts a JSON object and returns the object and its position. If you really need top-level arrays too, it should be easy to extend:

// Returns [parsedObject, startIndex, endIndex] for the first substring that
// parses as a JSON object, or null if there is none.
function extractJSON(str) {
    var firstOpen = str.indexOf('{');
    var firstClose, candidate;
    while (firstOpen !== -1) {
        // Start from the last closing brace and work backwards.
        firstClose = str.lastIndexOf('}');
        console.log('firstOpen: ' + firstOpen, 'firstClose: ' + firstClose);
        if (firstClose <= firstOpen) {
            return null;
        }
        do {
            candidate = str.substring(firstOpen, firstClose + 1);
            console.log('candidate: ' + candidate);
            try {
                var res = JSON.parse(candidate);
                console.log('...found');
                return [res, firstOpen, firstClose + 1];
            }
            catch (e) {
                console.log('...failed');
            }
            // Move to the previous closing brace.
            firstClose = str.lastIndexOf('}', firstClose - 1);
        } while (firstClose > firstOpen);
        // No closing brace worked; try the next opening brace.
        firstOpen = str.indexOf('{', firstOpen + 1);
    }
    return null;
}

var obj = {'foo': 'bar', xxx: '} me[ow]'};
var str = 'blah blah { not {json but here is json: ' + JSON.stringify(obj) + ' and here we have stuff that is } really } not ] json }} at all';
var result = extractJSON(str);
console.log('extracted object:', result[0]);
console.log('expected object :', obj);
console.log('did it work     ?', JSON.stringify(result[0]) === JSON.stringify(obj) ? 'yes!' : 'no');
console.log('surrounding str :', str.slice(0, result[1]) + '<JSON>' + str.slice(result[2]));
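
The returned positions are what steps 2 and 3 of the question need. The following is a minimal sketch, not part of the original answer: `replaceJSON` is a hypothetical helper, the sample text is made up, and the positions are computed directly here so the snippet runs on its own (in practice they would come from the extraction function).

```javascript
// Hypothetical helper: splice a modified value back into the original text.
// `start` and `end` are the positions returned alongside the parsed object.
function replaceJSON(str, start, end, value) {
    return str.slice(0, start) + JSON.stringify(value) + str.slice(end);
}

// Made-up response text containing exactly one JSON object.
var text = 'status=ok payload={"count":1,"tags":["a"]} trailer';

// Stand-ins for the [object, start, end] triple an extractor would return.
var start = text.indexOf('{');
var end = text.lastIndexOf('}') + 1;
var data = JSON.parse(text.slice(start, end));

// Step 2: modify the parsed object.
data.count += 1;
data.tags.push('b');

// Step 3: reinsert it to update the original string.
var updated = replaceJSON(text, start, end, data);
console.log(updated);
// status=ok payload={"count":2,"tags":["a","b"]} trailer
```

Note that JSON.stringify() re-serializes the value, so any whitespace or formatting the original JSON had inside the text is not preserved.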

Demo (executed in the nodejs environment, but should work in a browser, too): https://paste.aeum.net/show/81/
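
One possible way to extend the approach to top-level arrays is sketched below. This is an assumption about how the extension might look, not the answer author's code; `extractJSONValue` is a hypothetical name. It pairs each opening character with its own closing character and otherwise follows the same search order.

```javascript
// Hypothetical variant that accepts a top-level object or array.
// Returns [parsedValue, startIndex, endIndex], or null if nothing parses.
function extractJSONValue(str) {
    var closers = { '{': '}', '[': ']' };
    for (var open = 0; open < str.length; open++) {
        var closer = closers[str[open]];
        if (!closer) {
            continue; // not an opening brace or bracket
        }
        // Try every matching closer, from the last one back towards `open`.
        var close = str.lastIndexOf(closer);
        while (close > open) {
            try {
                var value = JSON.parse(str.substring(open, close + 1));
                return [value, open, close + 1];
            } catch (e) {
                // Not valid JSON; fall through to the previous closer.
            }
            close = str.lastIndexOf(closer, close - 1);
        }
    }
    return null;
}

var found = extractJSONValue('ids: [1, {"a": [2, 3]}] and ] more');
console.log(found);
```

Accepting arrays makes false positives more likely, because short bracketed fragments in ordinary prose, such as a footnote marker like [1], are themselves valid JSON arrays.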
