Matching JSON with a regular expression

Question

I have a JavaScript file containing many object literals:

// lots of irrelevant code
oneParticularFunction({
    key1: "string value",
    key2: 12345,
    key3: "strings which may contain ({ arbitrary characters })"
});
// more irrelevant code

I need to write some Python code to extract these literals.

My first attempt was the regular expression oneParticularFunction\(\{(.*?)\}\);. However, the non-greedy match stops at the first }); it finds, so it fails whenever a string inside the literal contains that sequence.
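To illustrate the failure, here is a minimal sketch. The sample input is made up for illustration, and the re.DOTALL flag is an assumption so that . also matches newlines in multi-line literals:

```python
import re

# Hypothetical input: the string value itself contains "});".
source = 'oneParticularFunction({ key1: "text }); more text", key2: 1 });'

# The first-attempt pattern from the question.
pattern = re.compile(r"oneParticularFunction\(\{(.*?)\}\);", re.DOTALL)

match = pattern.search(source)
captured = match.group(1)

# The non-greedy group ends at the "});" inside the string,
# so the capture is cut off in the middle of the value.
print(repr(captured))  # ' key1: "text '
```

The regex has no notion of being inside a string, so it cannot tell the real terminator from one that appears in a value.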

Since I know the objects will be valid JSON (matched quotes, braces, etc.) in a valid JavaScript file, is there a more elegant way to extract them?

(In other words, the difficulty is removing all the other JavaScript code I don't care about.)

In the end, I used a regular expression for any objects which don't contain sub-objects...

oneParticularFunction\((\{([^"}]*"[^"]*"[^"}]*)*?[^"]*?\})\);

...and tracked open/close braces by hand for anything with nesting.

Recommended answer

Why not write a small state machine that scans for {, increments a counter on every { and decrements it on every }? When the counter returns to 0, take all the characters in between and pass them to Python's json parser to check whether they are valid. That way you get proper syntax errors instead of a bare match/no-match from the regex (remember that Python is free of {, so false positives are impossible).
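The brace-counting idea can be sketched roughly as follows. This is only an illustration under some assumptions: the function name extract_object_literals and its marker parameter are hypothetical, and the string-literal tracking is an addition not mentioned in the answer, included so that braces inside quoted values are not counted. JavaScript comments and regex literals are not handled.

```python
import json


def extract_object_literals(source, marker="oneParticularFunction("):
    """Return the text of every {...} block that directly follows marker.

    Hypothetical helper: counts braces as the answer suggests, but
    ignores any braces that appear inside quoted strings.
    """
    results = []
    pos = 0
    while True:
        start = source.find(marker, pos)
        if start == -1:
            break
        i = start + len(marker)
        if i >= len(source) or source[i] != "{":
            pos = i  # marker not followed by an object literal
            continue
        depth = 0
        quote = None      # current string delimiter, or None outside strings
        escaped = False   # True right after a backslash inside a string
        end = None
        for j in range(i, len(source)):
            ch = source[j]
            if quote is not None:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == quote:
                    quote = None
            elif ch in "\"'":
                quote = ch
            elif ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    end = j + 1
                    break
        if end is None:
            break  # braces never balanced; stop scanning
        results.append(source[i:end])
        pos = end
    return results


# Made-up sample: a string containing "});" and a nested object.
sample = 'foo(); oneParticularFunction({"key1": "a }); b", "key2": {"n": 1}}); bar({});'
literals = extract_object_literals(sample)
parsed = [json.loads(text) for text in literals]  # raises ValueError if invalid
```

Note that json.loads accepts only strict JSON. A literal with unquoted keys, like the one in the question's sample, would be rejected, so such input would need its keys quoted first or a more lenient parser.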
