How to match balanced delimiters in JavaScript regex?

Question

I would have thought this problem to be impossible; as far as I know, JavaScript's regex flavor has neither recursive interpolation nor the nifty .NET balancing groups feature. Yet there it is, as problem 12 on regex.alf.nu: match balanced pairs of < and >. Unless there's some other pattern in the sets that I'm not getting.

So... is this possible? If so, how?

Notes:

  1. I know that this is impossible for true regular expressions, but based on the challenge it seems that it must be possible in JavaScript's flavor (which is at least irregular enough to have backreferences). I just don't know of any feature that would let them do this.

No other code - the form allows entry of a single regex, which is evaluated against the test strings on the page. I could try to crack the page to break out of the regex and into raw JS, I suppose, but that doesn't seem to be in the spirit of this challenge.

Since David asked, here are the test strings. Longer ones have been truncated with a character count, but the problem is entitled "Balance" and the ones that are complete certainly support the hypothesis that the "match" column has balanced pairs of < and > while the "not" column doesn't.

Match all of these…

<<<<<>><<>>><<... [62 chars]
<<<<<>><>><<><... [110 chars]
<<<<<>><>><>><... [102 chars]
<<<<<>><>>>><<... [88 chars]
<<<<<>>><<<>><... [58 chars]
<<<<<>>><<><>>... [152 chars]
<<<<<>>><<>><<... [42 chars]
<<<<<>>><>><<<>>>><<>>
<<<<<>>>><<<<>... [102 chars]
<<<<<>>>><<<><... [30 chars]
<<<<<>>>><><<<... [66 chars]
<<<<<>>>><><<<... [124 chars]
<<<<<>>>><>><<>>
<<<<><<>>><<<>... [34 chars]
<<<<>><<<>>>><... [92 chars]
<<<<>>><<<<>><>><<<>>>>>
<<<<>>><<<><<>>><><<>>>><<>>
<<<<>>><<><<<>... [84 chars]
<<<<>>>><<<><<... [52 chars]
<<<><<<>>>><<<... [50 chars]
<<<><<><>>>>
<<<><>><<<>>>>
<<<>><<<><<>>>... [44 chars]
<<<>><><<<><>>... [48 chars]
<<<>>><<><<<<>>>><<><<<>>>>>
<<><<<<>><>>>>... [60 chars]
<<>>
<<>><<<<<>>>>>... [54 chars]
<<>><<<<>><<<>... [74 chars]
<>
<><>

and none of these…

<
<<<<<<>>><<><>>>>>><<>
<<<<<>>><>>><<<>>>><>>
<<<<<>>>>>>
<<<<>><<<<<><<>><><<<<
<<<>><<<<><><><><
<<<>>>><><<<><>
<<><<<<><<><<>>><<
<<><<<>>>>><<
<<>>><<<>>
<><<<>><<>>><<>
<><<>>><<<><>><<<>>><<>>>><
<><<>>><><<<>
<><>><>>><><<<... [36 chars]
<>><><<<><>
<>>>>>><<<>><<>><><
<>>>>>>><<<
>
><
><<<>><><<<><<
><<<>>>><><<<<><>>><<><><<
><<><<<<><<<<>>>><
><><><<<>>>>>
><><>>><>><>
><><>>>><>>>>>>><>>><>>
><>><<<<<>>
><>><><><<>><<>>><<
><>>><>>>>><<><<<><>><>><<<
>><<<><<<<<<><>><<
>><>>><<<><>>><><<>><<><><<
>>>><>><>>>><>>><>><><
>>>>><<<>>>


Recommended answer

I do not believe that this is possible in JavaScript, though it's hard to prove. For example, Java and PHP do not have the features that you mention (recursive interpolation, balancing groups), but this fascinating Stack Overflow answer shows how to match a^n b^n using regexes in those languages. (Adapting that answer to the present case, the Java regex ^(?:(?:<(?=<*(\1?+>)))+\1)*$ should work. Correction: no, it's not that easily adapted.) But that answer depends on Java's support for the possessive quantifier ?+ (like ? except that you can't backtrack into it), and JavaScript doesn't have that.
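To make the point about the possessive quantifier concrete, here is a minimal sketch that asks the JavaScript engine whether it accepts a pattern at all. The helper name `compiles` is hypothetical, introduced only for this illustration. In JavaScript the `+` after `a?` is parsed as a second quantifier with nothing to repeat, so the pattern is rejected rather than treated as possessive.

```javascript
// Hypothetical helper: true if the engine accepts the pattern source,
// false if constructing the RegExp throws a SyntaxError.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected, so re-throw it
  }
}

console.log(compiles("a?"));  // greedy optional quantifier
console.log(compiles("a??")); // lazy optional quantifier
console.log(compiles("a?+")); // possessive syntax as written in Java or PCRE
```

The first two patterns compile and the third does not, which is why the Java technique cannot be carried over as written.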

That said, you can solve the referenced puzzle by writing this:

^(?:<(?:<(?:<(?:<(?:<(?:<(?:<>)*>)*>)*>)*>)*>)*>)*$

which matches up to seven levels of nesting. That's the most that any of the strings has, so it's all you need. (Several of the other puzzles at that page advise you to cheat because they're asking for something technically impossible; so while an elegant solution would obviously be more appealing, there's no reason to assume that one exists.)
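The pattern above follows a regular construction: each extra nesting level wraps the previous pattern in (?:< ... >)*. The sketch below builds it for any depth limit and compares it with a plain counter-based check. The names `balancedUpTo`, `isBalanced`, and `maxDepth` are hypothetical, and the counter-based functions are ordinary code that the puzzle form would not accept; they are here only to sanity-check the regex.

```javascript
// Build a regex that accepts balanced <...> strings nested at most `depth` deep.
// depth = 7 reproduces the hand-written pattern from the answer.
function balancedUpTo(depth) {
  let body = ""; // depth 0 accepts only the empty string
  for (let i = 0; i < depth; i++) {
    body = "(?:<" + body + ">)*"; // wrap one more level around the previous body
  }
  return new RegExp("^" + body + "$");
}

// Reference check with a counter: true if every > closes an earlier <.
function isBalanced(s) {
  let open = 0;
  for (const ch of s) {
    if (ch === "<") {
      open++;
    } else if (ch === ">") {
      if (--open < 0) return false; // a > with nothing left to close
    } else {
      return false; // only < and > are allowed
    }
  }
  return open === 0;
}

// Deepest nesting level reached while scanning a string of < and >.
function maxDepth(s) {
  let open = 0;
  let deepest = 0;
  for (const ch of s) {
    open += ch === "<" ? 1 : -1;
    deepest = Math.max(deepest, open);
  }
  return deepest;
}

const re = balancedUpTo(7);
const samples = ["<>", "<><>", "<<<<<>>><>><<<>>>><<>>", "<<>>><<<>>", "><", "<"];
for (const s of samples) {
  console.log(s, re.test(s), isBalanced(s), maxDepth(s));
}
```

For balanced strings whose nesting depth is at most 7, the regex and the counter agree. A balanced string nested more deeply is rejected by the regex, which is the limitation the answer accepts because none of the puzzle's strings go deeper.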
