What's the difference between "groups" and "captures" in .NET regular expressions?
Question
I'm a little fuzzy on what the difference between a "group" and a "capture" is when it comes to .NET's regular expression language. Consider the following C# code:
MatchCollection matches = Regex.Matches("{Q}", @"^\{([A-Z])\}$");
I expected this to result in a single capture for the letter Q, but if I print the properties of the returned MatchCollection, I see:
matches.Count: 1
matches[0].Value: {Q}
matches[0].Captures.Count: 1
matches[0].Captures[0].Value: {Q}
matches[0].Groups.Count: 2
matches[0].Groups[0].Value: {Q}
matches[0].Groups[0].Captures.Count: 1
matches[0].Groups[0].Captures[0].Value: {Q}
matches[0].Groups[1].Value: Q
matches[0].Groups[1].Captures.Count: 1
matches[0].Groups[1].Captures[0].Value: Q
What exactly is going on here? I understand that there's also a capture for the entire match, but how do the groups come in? And why doesn't matches[0].Captures include the capture for the letter 'Q'?
Recommended answer
You won't be the first who's fuzzy about it. Here's what the famous Jeffrey Friedl has to say about it (pages 437+):
Depending on your view, it either adds an interesting new dimension to the match results, or adds confusion and bloat.
And further on:
The main difference between a Group object and a Capture object is that each Group object contains a collection of Captures representing all the intermediary matches by the group during the match, as well as the final text matched by the group.
And a few pages later, this is his conclusion:
After getting past the .NET documentation and actually understanding what these objects add, I've got mixed feelings about them. On one hand, it's an interesting innovation [..] on the other hand, it seems to add an efficiency burden [..] of a functionality that won't be used in the majority of cases
In other words: they are very similar, but occasionally and as it happens, you'll find a use for them. Before you grow another grey beard, you may even get fond of the Captures...
Since neither the above nor what's said in the other post really seems to answer your question, consider the following. Think of Captures as a kind of history tracker. When the regex makes its match, it goes through the string from left to right (ignoring backtracking for a moment), and when it encounters a matching pair of capturing parentheses, it stores the matched text in $x (x being any digit), let's say $1.
Normal regex engines, when the capturing parentheses are repeated, throw away the current $1 and replace it with the new value. Not .NET, which keeps this history and places it in the group's Captures collection (Captures[0], Captures[1], and so on).
If we change your regex to look as follows:
MatchCollection matches = Regex.Matches("{Q}{R}{S}", @"(\{[A-Z]\})+");
you will notice that the first Group will have one Capture (the first group always being the whole match, i.e., equal to $0) and the second group will hold {S}, i.e. only the last text matched by the group. However, and here's the catch, if you want to find the other two matches, they're in that group's Captures, which contains all intermediary captures for {Q}, {R} and {S}.
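The behaviour described above can be checked with a short C# sketch. It reuses the input and pattern from the example; the class name `CaptureDemo` and the helper `GroupOneCaptures` are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: returns the text of every capture that
    // group 1 recorded during the first match of the pattern.
    public static string[] GroupOneCaptures(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Groups[1].Captures
            .Cast<Capture>()
            .Select(c => c.Value)
            .ToArray();
    }

    public static void Main()
    {
        Match match = Regex.Match("{Q}{R}{S}", @"(\{[A-Z]\})+");

        // Group 0 is the whole match; group 1 keeps only its last value.
        Console.WriteLine(match.Groups[0].Value);          // {Q}{R}{S}
        Console.WriteLine(match.Groups[1].Value);          // {S}

        // The Captures collection of group 1 keeps the full history.
        Console.WriteLine(match.Groups[1].Captures.Count); // 3
        foreach (Capture capture in match.Groups[1].Captures)
        {
            // Prints "0: {Q}", "3: {R}", "6: {S}" (start index, then text).
            Console.WriteLine($"{capture.Index}: {capture.Value}");
        }
    }
}
```

Each Capture also carries its Index and Length, so the individual repetitions can be located in the input string, not just read as text.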
If you ever wondered how to get from the multiple-capture group, which only shows the last match, to the individual captures that are clearly there in the string, you must use Captures.
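A final word on your final question: the total match always has exactly one total Capture; don't mix that up with the individual Groups. Captures are only interesting inside groups.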
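As a rough illustration of why matches[0].Captures does not contain the letter Q, the sketch below reruns the pattern from the question and compares the match-level Captures collection with the one that belongs to group 1. The class name `MatchLevelCaptures` is an assumption made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchLevelCaptures
{
    public static void Main()
    {
        Match match = Regex.Match("{Q}", @"^\{([A-Z])\}$");

        // The match-level collection holds only the capture of the whole match.
        Console.WriteLine(match.Captures.Count);              // 1
        Console.WriteLine(match.Captures[0].Value);           // {Q}

        // The letter is stored in group 1 and in that group's own capture list.
        Console.WriteLine(match.Groups[1].Value);             // Q
        Console.WriteLine(match.Groups[1].Captures[0].Value); // Q
    }
}
```

These values agree with the output listed in the question: the match-level collection describes the match as a whole, while per-group history is reached through Groups[n].Captures.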