What's the difference between "groups" and "captures" in .NET regular expressions?


Question

I'm a little fuzzy on what the difference between a "group" and a "capture" is when it comes to .NET's regular expression language. Consider the following C# code:

MatchCollection matches = Regex.Matches("{Q}", @"^\{([A-Z])\}$");

I expected this to result in a single capture for the letter Q, but if I print the properties of the returned MatchCollection, I see:

matches.Count: 1
matches[0].Value: {Q}
        matches[0].Captures.Count: 1
                matches[0].Captures[0].Value: {Q}
        matches[0].Groups.Count: 2
                matches[0].Groups[0].Value: {Q}
                matches[0].Groups[0].Captures.Count: 1
                        matches[0].Groups[0].Captures[0].Value: {Q}
                matches[0].Groups[1].Value: Q
                matches[0].Groups[1].Captures.Count: 1
                        matches[0].Groups[1].Captures[0].Value: Q

What exactly is going on here? I understand that there's also a capture for the entire match, but how do the groups come in? And why doesn't matches[0].Captures include the capture for the letter 'Q'?

Recommended answer

You won't be the first who's fuzzy about it. Here's what the famous Jeffrey Friedl has to say about it (pages 437+):

Depending on your view, it either adds an interesting new dimension to the match results, or adds confusion and bloat.

And further on:

The main difference between a Group object and a Capture object is that each Group object contains a collection of Captures representing all the intermediary matches by the group during the match, as well as the final text matched by the group.

And a few pages later, this is his conclusion:

After getting past the .NET documentation and actually understanding what these objects add, I've got mixed feelings about them. On one hand, it's an interesting innovation [..] on the other hand, it seems to add an efficiency burden [..] of a functionality that won't be used in the majority of cases

In other words: they are very similar, but every now and then you'll find a use for them. Before you grow another grey beard, you may even get fond of the Captures...

Since neither the above, nor what's said in the other post, really seems to answer your question, consider the following. Think of Captures as a kind of history tracker. When the regex makes its match, it goes through the string from left to right (ignoring backtracking for a moment), and when it encounters a matching pair of capturing parentheses, it stores the matched text in $x (x being any number), let's say $1.

Normal regex engines, when the capturing parentheses are repeated, throw away the current $1 and replace it with the new value. Not .NET, which keeps this history and places it in the group's Captures collection, starting at Captures[0].
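
To make the "last value versus history" point concrete, here is a minimal sketch. The input "abc" and the pattern (\w)+ are illustrative assumptions, not taken from the question, and the snippet assumes C# 9 or later so that top-level statements compile.

```csharp
using System;
using System.Text.RegularExpressions;

// One capturing group, repeated once per character by the + quantifier.
Match m = Regex.Match("abc", @"(\w)+");

// In a replacement string, $1 refers only to the group's final value.
string replaced = Regex.Replace("abc", @"(\w)+", "$1");

Console.WriteLine(m.Groups[1].Value);          // text of the last iteration
Console.WriteLine(replaced);                   // same text, via $1
Console.WriteLine(m.Groups[1].Captures.Count); // number of iterations retained
```

The group's Value and $1 both reflect only the final iteration, while Captures.Count reflects every iteration the group went through.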

If we change your regex to look as follows:

MatchCollection matches = Regex.Matches("{Q}{R}{S}", @"(\{[A-Z]\})+");

you will notice that the first Group has a single Capture (the first group always being the whole match, i.e. equal to $0), and the second group holds {S}, i.e. only the last text matched by that group. However, and here's the catch, if you want to find the other two matches, they're in the second group's Captures, which contains all intermediary captures: {Q}, {R} and {S}.

If you ever wondered how to get from a repeated capturing group, which only shows its last match, to the individual captures that are clearly there in the string, you must use Captures.
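
As a rough sketch of how that looks in practice, the following walks the Captures of the repeated group for the "{Q}{R}{S}" example. The variable names are my own, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

Match match = Regex.Match("{Q}{R}{S}", @"(\{[A-Z]\})+");

Group whole = match.Groups[0]; // the entire match
Group inner = match.Groups[1]; // the repeated capturing group

Console.WriteLine($"Groups[0].Value: {whole.Value}");
Console.WriteLine($"Groups[1].Value: {inner.Value}");

// Cast<Capture>() also works on frameworks where CaptureCollection is non-generic.
string[] history = inner.Captures.Cast<Capture>().Select(cap => cap.Value).ToArray();

// Each Capture also records where in the input it was found.
foreach (Capture capture in inner.Captures)
    Console.WriteLine($"  capture at index {capture.Index}: {capture.Value}");
```

Here history ends up holding the three individual pieces in the order they were matched, whereas Groups[1].Value alone would only give you the last one.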

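A final word on your last question: the overall match always has exactly one capture of its own, so don't mix that up with the individual Groups. In .NET, Match derives from Group, so matches[0].Captures is the capture list of the whole-match group, which is why it contains {Q} and not Q. Captures are only interesting inside groups.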
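
This can be checked against the original pattern from the question with a short sketch (variable names are my own, and top-level statements assume C# 9 or later):

```csharp
using System;
using System.Text.RegularExpressions;

Match first = Regex.Matches("{Q}", @"^\{([A-Z])\}$")[0];

// Match inherits from Group, which in turn inherits from Capture.
bool matchIsGroup = typeof(Group).IsAssignableFrom(typeof(Match));
Console.WriteLine(matchIsGroup);

// The match's own capture list is that of the whole-match group (group 0).
Console.WriteLine(first.Captures[0].Value);
Console.WriteLine(first.Groups[0].Captures[0].Value);

// The letter is only reachable through group 1.
Console.WriteLine(first.Groups[1].Captures[0].Value);
```

The values printed here line up with the output listed in the question: the match and group 0 report {Q}, and only group 1 reports Q.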
