Regular expression to parse an array of JSON objects?

Question

I'm trying to parse an array of JSON objects into an array of strings in C#. I can extract the array from the JSON object, but I can't split the array string into an array of individual objects.

What I have is this test string:

string json = "{items:[{id:0,name:\"Lorem Ipsum\"},{id:1,name" 
            + ":\"Lorem Ipsum\"},{id:2,name:\"Lorem Ipsum\"}]}";

Right now I'm using the following regular expressions to split the items into individual objects. They are two separate regular expressions until I fix the problem with the second one:

Regex arrayFinder = new Regex(@"\{items:\[(?<items>[^\]]*)\]\}"
                                 , RegexOptions.ExplicitCapture);
Regex arrayParser = new Regex(@"((?<items>\{[^\}]\}),?)+"
                                 , RegexOptions.ExplicitCapture);

The arrayFinder regex works the way I'd expect, but for reasons I don't understand, the arrayParser regex doesn't work at all. All I want it to do is split the individual items into their own strings, so that I get a list like this:

{id:0,name:"Lorem Ipsum"}
{id:1,name:"Lorem Ipsum"}
{id:2,name:"Lorem Ipsum"}

Whether this list is a string[] array, a Group, or a Match collection doesn't matter, but I'm stumped as to how to get the objects split. Using the arrayParser and the json string declared above, I've tried this code, which I assumed would work, with no luck:

string json = "{items:[{id:0,name:\"Lorem Ipsum\"},{id:1,name" 
            + ":\"Lorem Ipsum\"},{id:2,name:\"Lorem Ipsum\"}]}";

Regex arrayFinder = new Regex(@"\{items:\[(?<items>[^\]]*)\]\}"
                                 , RegexOptions.ExplicitCapture);
Regex arrayParser = new Regex(@"((?<items>\{[^\}]\}),?)+"
                                 , RegexOptions.ExplicitCapture);

string array = arrayFinder.Match(json).Groups["items"].Value;
// At this point the 'array' variable contains: 
// {id:0,name:"Lorem Ipsum"},{id:1,name:"Lorem Ipsum"},{id:2,name:"Lorem Ipsum"}

// I would have expected one of these 2 lines to return 
// the array of matches I'm looking for
CaptureCollection c = arrayParser.Match(array).Captures;
GroupCollection g = arrayParser.Match(array).Groups;

Can anybody see what it is I'm doing wrong? I'm totally stuck on this.

Recommended answer

Balanced parentheses are literally a textbook example of a language that cannot be processed with regular expressions. JSON is essentially balanced parentheses plus a bunch of other stuff, with the parentheses replaced by braces and brackets. In the hierarchy of formal languages, JSON is a context-free language. Regular expressions can't parse context-free languages.
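To make the limitation concrete, here is a minimal sketch that is not part of the original answer. It assumes the question's arrayParser pattern with the missing `*` added after `[^\}]`, and reads the repeated group through .NET's `Group.Captures`. The class name `RegexSplitDemo`, the method `Split`, and the nested test string are hypothetical names chosen for illustration. The flat input splits as the asker wanted, while a single nested object is enough to cut a capture short.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSplitDemo
{
    // The question's arrayParser with '*' added so that [^\}] can match
    // more than one character between the braces (an assumed fix).
    private static readonly Regex ArrayParser = new Regex(
        @"((?<items>\{[^\}]*\}),?)+", RegexOptions.ExplicitCapture);

    // Returns every capture recorded by the repeated "items" group.
    public static string[] Split(string array)
    {
        Match m = ArrayParser.Match(array);
        return m.Groups["items"].Captures
                .Cast<Capture>()
                .Select(c => c.Value)
                .ToArray();
    }

    public static void Main()
    {
        string flat = "{id:0,name:\"Lorem Ipsum\"},{id:1,name:\"Lorem Ipsum\"}";
        string nested = "{id:0,tags:{a:1}},{id:1}";

        foreach (string s in Split(flat)) Console.WriteLine(s);
        // {id:0,name:"Lorem Ipsum"}
        // {id:1,name:"Lorem Ipsum"}

        foreach (string s in Split(nested)) Console.WriteLine(s);
        // {id:0,tags:{a:1}
        // The capture stops at the first '}', so the object is truncated
        // and the second item is never reached.
    }
}
```

The pattern has no way to count how many braces are open, which is exactly the balanced-delimiter problem described above.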

Some systems offer extensions to regular expressions that kinda-sorta handle balanced expressions. However, they're all ugly hacks, they're all unportable, and they're all ultimately the wrong tool for the job.

In professional work, you would almost always use an existing JSON parser. If you want to roll your own for educational purposes then I'd suggest starting with a simple arithmetic grammar that supports + - * / ( ). (JSON has some escaping rules which, while not complex, will make your first attempt harder than it needs to be.) Basically, you'll need to:

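  1. Decompose the language into an alphabet of symbols
  2. Write a context-free grammar in terms of those symbols that recognizes the language
  3. Convert the grammar into Chomsky normal form, or near enough to make step 5 easy
  4. Write a lexer that converts raw text into your input alphabet
  5. Write a recursive descent parser that takes your lexer's output, parses it, and produces some kind of output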

This is a typical third-year CS assignment at just about any university.
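As a rough illustration of steps 4 and 5, here is a minimal sketch for the arithmetic grammar suggested above. It is one possible shape, not a reference solution. It assumes the grammar `expr := term (('+'|'-') term)*`, `term := factor (('*'|'/') factor)*`, `factor := NUMBER | '(' expr ')'`, and it evaluates the expression directly instead of building a tree. The names `Arithmetic`, `Lex`, and `Evaluate` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class Arithmetic
{
    // Step 4: the lexer turns raw text into tokens
    // (numbers and the single-character symbols + - * / ( )).
    public static List<string> Lex(string text)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i];
            if (char.IsWhiteSpace(c)) { i++; }
            else if (c >= '0' && c <= '9')
            {
                int start = i;
                while (i < text.Length &&
                       ((text[i] >= '0' && text[i] <= '9') || text[i] == '.')) i++;
                tokens.Add(text.Substring(start, i - start));
            }
            else if ("+-*/()".IndexOf(c) >= 0) { tokens.Add(c.ToString()); i++; }
            else throw new FormatException("Unexpected character '" + c + "'");
        }
        return tokens;
    }

    // Step 5: one method per grammar rule; each consumes tokens and
    // returns the value of the sub-expression it recognized.
    public static double Evaluate(string text)
    {
        List<string> tokens = Lex(text);
        int pos = 0;
        double value = ParseExpr(tokens, ref pos);
        if (pos != tokens.Count) throw new FormatException("Unexpected token " + tokens[pos]);
        return value;
    }

    private static double ParseExpr(List<string> t, ref int pos)
    {
        double left = ParseTerm(t, ref pos);
        while (pos < t.Count && (t[pos] == "+" || t[pos] == "-"))
        {
            string op = t[pos++];
            double right = ParseTerm(t, ref pos);
            left = op == "+" ? left + right : left - right;
        }
        return left;
    }

    private static double ParseTerm(List<string> t, ref int pos)
    {
        double left = ParseFactor(t, ref pos);
        while (pos < t.Count && (t[pos] == "*" || t[pos] == "/"))
        {
            string op = t[pos++];
            double right = ParseFactor(t, ref pos);
            left = op == "*" ? left * right : left / right;
        }
        return left;
    }

    private static double ParseFactor(List<string> t, ref int pos)
    {
        if (pos >= t.Count) throw new FormatException("Unexpected end of input");
        string tok = t[pos++];
        if (tok == "(")
        {
            double inner = ParseExpr(t, ref pos); // recursion handles nesting
            if (pos >= t.Count || t[pos] != ")") throw new FormatException("Expected ')'");
            pos++;
            return inner;
        }
        return double.Parse(tok, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(Evaluate("2 * (3 + 4) - 10 / 5")); // prints 12
    }
}
```

The recursive call in `ParseFactor` is what a regular expression lacks: each open parenthesis gets its own stack frame, so nesting depth is tracked by the call stack.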

The next step is to find out how deeply nested a JSON string needs to be to trigger a stack overflow in your recursive parser. Then look at the other types of parsers that can be written, and you'll understand why anyone who has to parse a context-free language in the real world uses a tool like yacc or ANTLR instead of writing a parser by hand.

If that's more learning than you were looking for, then you should feel free to go use an off-the-shelf JSON parser, satisfied that you learned something important and useful: the limits of regular expressions.
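For the off-the-shelf route, one option in current .NET is `System.Text.Json`. This is a sketch, not something the original answer names. Note one assumption: the question's test string uses unquoted property names, which strict JSON does not allow, so the input here quotes them. The names `JsonSplitDemo` and `SplitItems` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class JsonSplitDemo
{
    // Returns the raw text of each element of the top-level "items" array.
    public static List<string> SplitItems(string json)
    {
        var result = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement items = doc.RootElement.GetProperty("items");
            foreach (JsonElement item in items.EnumerateArray())
            {
                result.Add(item.GetRawText());
            }
        }
        return result;
    }

    public static void Main()
    {
        // Property names are quoted here (assumption: strict JSON input).
        string json = "{\"items\":[{\"id\":0,\"name\":\"Lorem Ipsum\"},"
                    + "{\"id\":1,\"tags\":{\"a\":1}}]}";

        foreach (string item in SplitItems(json)) Console.WriteLine(item);
        // {"id":0,"name":"Lorem Ipsum"}
        // {"id":1,"tags":{"a":1}}
    }
}
```

Because the parser tracks nesting itself, the second item comes back whole even though it contains a nested object.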
