Capturing inner items using .net Regex Balanced Matching

Views: 131 Published: 2015/11/26 21:56:56 Tags: .net regex

Question

I have found the following resources on Balanced Matching for .net Regexes:

http://weblogs.asp.net/whaggard/archive/2005/02/20/377025.aspx
http://blogs.msdn.com/bclteam/archive/2005/03/15/396452.aspx
http://msdn.microsoft.com/en-us/library/bs2twtah%28VS.85%29.aspx#BalancingGroupDefinitionExample

From what I have read in these, the following example should work:

This regex should find an "a" anywhere within an angle-bracket group, no matter how deep. It should match "<a>", "<<a>>", "<a<>>", "<<>a>", "<<><a>>", etc.

(?<=
    ^
    (
    	(
    		<(?<Depth>)
    		|
    		>(?<-Depth>)
    	)
    	[^<>]*?
    )+?
)
(?(Depth)a|(?!))

The problem is matching on the "a" in the string "<<>a>".

While it will work for strings "<a<>>" and "<<a>>", I can't get it to match an "a" that is following a ">".

According to the explanations I have read, the first two "<"s should increment Depth twice, then the first ">" should decrement it once. At this point, (?(Depth)a|(?!)) should perform the "yes" option, but the regex never even makes it here.
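
The counting described above can be checked in isolation. The following is a minimal sketch, not part of the original question: it scans a prefix from left to right with the same push and pop groups and reports how many Depth captures remain. The names DepthDemo and DepthAfter are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DepthDemo
{
    // Forward (left-to-right) scan: "<" pushes an empty capture onto Depth,
    // ">" pops one, and any other character is simply consumed.
    private static readonly Regex Scanner = new Regex(
        @"^(?:[^<>]|<(?<Depth>)|>(?<-Depth>))*$");

    // Returns the number of "<" still open at the end of prefix, or -1 when
    // a ">" appears with nothing to pop (the pattern then fails to match).
    public static int DepthAfter(string prefix)
    {
        Match m = Scanner.Match(prefix);
        return m.Success ? m.Groups["Depth"].Captures.Count : -1;
    }

    public static void Main()
    {
        foreach (string prefix in new[] { "<", "<<", "<<>", "<a<>" })
        {
            Console.WriteLine("{0} -> {1}", prefix, DepthAfter(prefix));
        }
    }
}
```

For the prefix "<<>" this reports a depth of 1, which agrees with the reasoning in the question: two pushes followed by one pop. Note that this sketch matches in the forward direction, outside any lookbehind.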

Consider the following regex, which makes no such check and still fails to match the string in question:

(?<=
    ^
    (
    	(
    		<(?<Depth>)
    		|
    		>(?<-Depth>)
    	)
    	[^<>]*?
    )+?
)
a

Am I missing something, or is the regex engine working incorrectly?
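
The behaviour reported above can be reproduced with a small harness. This is a sketch under assumptions: it uses the second pattern from the question with RegexOptions.IgnorePatternWhitespace (the question does not say which options were used), and the names LookbehindRepro and FindsA are hypothetical. A likely cause, offered here as an assumption rather than something the original posts state, is that .NET evaluates lookbehind assertions from right to left: scanning backward from the "a" in "<<>a>", the engine meets ">" first and tries to pop Depth while it is still empty.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindRepro
{
    // The second pattern from the question, unchanged apart from indentation.
    private static readonly Regex Pattern = new Regex(@"
        (?<=
            ^
            (
                (
                    <(?<Depth>)
                    |
                    >(?<-Depth>)
                )
                [^<>]*?
            )+?
        )
        a", RegexOptions.IgnorePatternWhitespace);

    // True when the pattern finds an "a" preceded by the bracket prefix.
    public static bool FindsA(string input)
    {
        return Pattern.IsMatch(input);
    }

    public static void Main()
    {
        foreach (string input in new[] { "<a<>>", "<<a>>", "<<>a>" })
        {
            Console.WriteLine("{0} -> {1}", input, FindsA(input));
        }
    }
}
```

In line with the question's report, the first two inputs match and "<<>a>" does not.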

Recommended answer

If you want to find every 'a' that's inside a balanced pair of angle brackets, I would suggest this approach:

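using System;
using System.Text.RegularExpressions;

// Match one balanced <...> group; group 1 captures every 'a' inside it.
// Group N acts as a counter: each nested '<' pushes, each nested '>' pops,
// and (?(N)(?!)) rejects the match if any nested '<' is still unclosed.
Regex r = new Regex(@"
    <
      (?>
         [^<>a]+
       |
         (a)
       |
         <(?<N>)
       |
         >(?<-N>)
      )+
    (?(N)(?!))
    >
", RegexOptions.IgnorePatternWhitespace);
string target = @"012a<56a8<0a2<4a6a>>012a<56789a>23456a";
foreach (Match m in r.Matches(target))
{
  // Index and text of the whole balanced group.
  Console.WriteLine("{0}, {1}", m.Index, m.Value);
  // Index and text of each 'a' captured inside that group.
  foreach (Capture c in m.Groups[1].Captures)
  {
    Console.WriteLine("{0}, {1}", c.Index, c.Value);
  }
}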

Result:

9, <0a2<4a6a>>
11, a
15, a
17, a
24, <56789a>
30, a

Instead of mucking about with the conditional, it goes ahead and matches the whole bracket-delimited (sub)string, in the process capturing any a's it might contain. Unlike your approach, it can pluck any number of bracketed substrings out of a larger string, and any number of a's out of each substring.
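
If only the positions of the a's are needed, the same regex can be wrapped in a small helper. This is a sketch rather than part of the original answer; the names InnerAFinder and FindInnerAs are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class InnerAFinder
{
    // Same pattern as in the answer: one balanced <...> group per match,
    // with group 1 capturing every 'a' found inside it.
    private static readonly Regex Balanced = new Regex(@"
        <
          (?>
             [^<>a]+
           |
             (a)
           |
             <(?<N>)
           |
             >(?<-N>)
          )+
        (?(N)(?!))
        >
    ", RegexOptions.IgnorePatternWhitespace);

    // Returns the index of every 'a' that sits inside a balanced group.
    public static List<int> FindInnerAs(string input)
    {
        var indexes = new List<int>();
        foreach (Match m in Balanced.Matches(input))
        {
            foreach (Capture c in m.Groups[1].Captures)
            {
                indexes.Add(c.Index);
            }
        }
        return indexes;
    }

    public static void Main()
    {
        string target = @"012a<56a8<0a2<4a6a>>012a<56789a>23456a";
        Console.WriteLine(string.Join(", ", FindInnerAs(target)));
    }
}
```

For the sample string this yields the indexes 11, 15, 17 and 30, the same a's listed in the result above. The a's at 3, 7, 23 and 37 are skipped because they are either outside any brackets or inside the unclosed "<" at index 4.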
