Regex to match a JSON String


Problem description

I am building a JSON validator from scratch, but I am stuck on the string part. My hope was to build a regex that matches the string syntax diagram found on JSON.org.

My regex so far is:

/^\"((?=\\)\\(\"|\/|\\|b|f|n|r|t|u[0-9a-f]{4}))*\"$/

It matches a backslash followed by one of the escape characters, and it matches an empty string. But I'm not sure how to handle the UNICODE part.

Is there a regex to match any UNICODE character except " or \ or a control character? And will it match a newline or horizontal tab?

The last question arises because the regex matches the string "\t", but not " " (shown here as spaces, but meant to be a literal tab). Otherwise I will need to extend the regex to cover it, which is not a problem, but my guess is that the horizontal tab counts as a UNICODE character.

Thanks to Jaeger Kor, I now have the following regex:

/^\"((?=\\)\\(\"|\/|\\|b|f|n|r|t|u[0-9a-f]{4})|[^\\"]*)*\"$/

It appears to be correct, but is there a way to check for control characters, or is that unnecessary given that they are listed among the non-printable characters on regular-expressions.info? The input to validate is always text from a textarea.

Update: the regex is as follows, in case anyone needs it:

/^("(((?=\\)\\(["\\\/bfnrt]|u[0-9a-fA-F]{4}))|[^"\\\0-\x1F\x7F]+)*")$/
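To make the behaviour asked about above concrete, here is a minimal Python sketch of the same string rule. It is an illustration rather than the asker's code: the names `JSON_STRING` and `is_json_string` are hypothetical, the redundant lookahead `(?=\\)` is dropped, and the inner `+` is removed so that each loop iteration consumes exactly one escape or one character. Excluding `\x7F` mirrors the regex above; the JSON grammar itself (RFC 8259) only requires U+0000 through U+001F to be escaped.

```python
import re

# Hypothetical port of the updated pattern above. Each iteration of the
# outer loop consumes either one escape sequence or one ordinary character.
JSON_STRING = re.compile(
    r'"(?:\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4})|[^"\\\x00-\x1f\x7f])*"'
)


def is_json_string(text: str) -> bool:
    """Return True if the whole text is one double-quoted JSON string."""
    return JSON_STRING.fullmatch(text) is not None


print(is_json_string('""'))         # empty string
print(is_json_string(r'"\t"'))      # escaped tab: backslash followed by t
print(is_json_string('"\t"'))       # literal tab character between the quotes
print(is_json_string(r'"\u00e9"'))  # \u escape with four hex digits
print(is_json_string('"é"'))        # unescaped non-ASCII character
```

The literal tab is the only one of these rejected, because it falls in the excluded `\x00-\x1f` range, while unescaped non-ASCII characters pass without any special UNICODE handling.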

Solution

For your exact question, create a character class:

# Matches any character that isn't a \ or "
/[^\\"]/

You can then append * to match zero or more of those characters, or alternatively + to match one or more:

/[^\\"]*/

or

/[^\\"]+/
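As a quick illustration of the two quantifiers, the following Python sketch applies both character-class patterns to a few inputs. The names `ZERO_OR_MORE`, `ONE_OR_MORE` and `matches` are made up for this example. Note that this class on its own still accepts control characters such as a literal tab.

```python
import re

# The two patterns from the answer, written as Python raw strings.
ZERO_OR_MORE = re.compile(r'[^\\"]*')
ONE_OR_MORE = re.compile(r'[^\\"]+')


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None


for sample in ['', 'abc', 'a"b', 'a\\b', 'a\tb']:
    # repr() makes the backslash and the tab visible in the output.
    print(repr(sample), matches(ZERO_OR_MORE, sample), matches(ONE_OR_MORE, sample))
```

Only the empty input separates the two: `*` accepts it and `+` does not. Inputs containing a quote or a backslash are rejected by both.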

There is also the pattern below, found at https://regex101.com/ under the library tab when searching for json:

/(?(DEFINE)
# Note that everything is atomic, JSON does not need backtracking if it's valid
# and this prevents catastrophic backtracking
(?<json>(?>\s*(?&object)\s*|\s*(?&array)\s*))
(?<object>(?>\{\s*(?>(?&pair)(?>\s*,\s*(?&pair))*)?\s*\}))
(?<pair>(?>(?&STRING)\s*:\s*(?&value)))
(?<array>(?>\[\s*(?>(?&value)(?>\s*,\s*(?&value))*)?\s*\]))
(?<value>(?>true|false|null|(?&STRING)|(?&NUMBER)|(?&object)|(?&array)))
(?<STRING>(?>"(?>\\(?>["\\\/bfnrt]|u[a-fA-F0-9]{4})|[^"\\\0-\x1F\x7F]+)*"))
(?<NUMBER>(?>-?(?>0|[1-9][0-9]*)(?>\.[0-9]+)?(?>[eE][+-]?[0-9]+)?))
)
\A(?&json)\z/x

This should match any valid JSON; you can also test it on the website above.
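For testing offline, one option is to compare a pattern against a reference parser. Python's built-in `re` module does not support `(?(DEFINE)...)` or `(?&name)` subroutine calls, so the sketch below ports only the non-recursive STRING and NUMBER rules and checks them against `json.loads`. Everything here is an assumption for illustration: the helper names and sample list are invented, atomic groups are replaced by plain non-capturing groups, the inner `+` of STRING is dropped to avoid nested quantifiers, and whitespace is limited to the four characters JSON allows.

```python
import json
import re

# Non-recursive rules ported from the PCRE pattern (hypothetical port).
STRING = r'"(?:\\(?:["\\/bfnrt]|u[a-fA-F0-9]{4})|[^"\\\x00-\x1f\x7f])*"'
NUMBER = r'-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?'
SCALAR = re.compile(
    rf'[ \t\n\r]*(?:true|false|null|{STRING}|{NUMBER})[ \t\n\r]*'
)


def regex_accepts(text: str) -> bool:
    """True if the whole text is one scalar JSON value according to the regex."""
    return SCALAR.fullmatch(text) is not None


def parser_accepts(text: str) -> bool:
    """True if Python's json module parses the text without error."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


SAMPLES = ['0', '-0', '1.5e+10', '01', '1.', 'true', ' null ',
           r'"a\tb"', '"a\tb"', r'"\u12G4"', '"\x7f"']

# Collect the inputs on which the regex and the parser disagree.
mismatches = [s for s in SAMPLES if regex_accepts(s) != parser_accepts(s)]
print(mismatches)
```

On these samples the only disagreement is the unescaped DEL character (`\x7f`), which the pattern rejects and `json.loads` accepts. That matches the pattern's character class, which excludes `\x7F` in addition to the control range.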

EDIT:

Link to the regex
