Regex - Capturing a Repeated Group

Problem description

Alright, I've read the tutorials and scrambled my head too much to be able to see clearly now.

I'm trying to capture parameters and their type info from a function signature. So given a signature like this:

function(/*string*/a,b,c)

I want to get the parts like this:

type: string
param:a
param:b
param:c

Even better would be this:

type: string
param:a
type: null (or whitespace)
param:b
type: null (or whitespace)
param:c

So I came up with this regex, which makes the common mistake of repeating a capture group (I have explicit capture turned on):

function\(((\/\*(?<type>[a-zA-Z]+)\*\/)?(?<param>[0-9a-zA-Z_$]+),?)*\)

Problem is, I can't correct the mistake. :(. Please help!

Recommended answer

Generally, you'd need two steps to get all data.
First, match/validate the whole function:

function\((?<parameters>((\/\*[a-zA-Z]+\*\/)?[0-9a-zA-Z_$]+,?)*)\)

Note that you now have a parameters group containing all of the parameters. You can match part of the pattern again against that group to get each parameter, or, in this case, simply split on commas.
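For engines that keep only the last capture of a repeated group, the two steps might look roughly like the sketch below. It is written in C# only to stay consistent with the rest of this answer. The class name TwoStepParser, the Parse helper, the second per-parameter regex, and the added ^ and $ anchors are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class TwoStepParser
{
    // Step 1: validate the whole signature and grab everything between the parentheses.
    // The ^ and $ anchors are an assumption added for this sketch.
    static readonly Regex Signature = new Regex(
        @"^function\((?<parameters>((/\*[a-zA-Z]+\*/)?[0-9a-zA-Z_$]+,?)*)\)$");

    // Step 2 (hypothetical helper pattern): applied to one parameter at a time.
    static readonly Regex Parameter = new Regex(
        @"^(/\*(?<type>[a-zA-Z]+)\*/)?(?<param>[0-9a-zA-Z_$]+)$");

    // Returns (type, param) pairs; the type is "" when there is no type comment.
    public static List<KeyValuePair<string, string>> Parse(string signature)
    {
        var result = new List<KeyValuePair<string, string>>();
        Match whole = Signature.Match(signature);
        if (!whole.Success)
            return result;

        // Split the captured parameter list on commas, ignoring a trailing comma.
        string[] pieces = whole.Groups["parameters"].Value
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string piece in pieces)
        {
            Match m = Parameter.Match(piece);
            if (m.Success)
                result.Add(new KeyValuePair<string, string>(
                    m.Groups["type"].Value, m.Groups["param"].Value));
        }
        return result;
    }

    static void Main()
    {
        foreach (var pair in Parse("function(/*string*/a,b,c)"))
            Console.WriteLine("type: '{0}', param: {1}", pair.Key, pair.Value);
    }
}
```

Because each parameter is matched on its own, a missing type simply yields an empty string for that parameter, so types and parameters stay paired without any extra bookkeeping.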

If you happen to be using .NET, you're in luck. .NET keeps a full record of all captures of each group, so you can use the collection:

match.Groups["param"].Captures
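As a rough illustration of what that collection holds, the sketch below runs the regex from the question against the sample signature. The class name CapturesDemo and the ParamCaptures helper are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class CapturesDemo
{
    // The regex from the question, unchanged.
    const string Pattern =
        @"function\(((\/\*(?<type>[a-zA-Z]+)\*\/)?(?<param>[0-9a-zA-Z_$]+),?)*\)";

    // Hypothetical helper: returns every capture of the "param" group, in order.
    public static string[] ParamCaptures(string signature)
    {
        Match match = Regex.Match(signature, Pattern, RegexOptions.ExplicitCapture);
        CaptureCollection captures = match.Groups["param"].Captures;
        var values = new string[captures.Count];
        for (int i = 0; i < captures.Count; i++)
            values[i] = captures[i].Value;
        return values;
    }

    static void Main()
    {
        string s = "function(/*string*/a,b,c)";
        Match match = Regex.Match(s, Pattern, RegexOptions.ExplicitCapture);

        // Group.Value reflects only the last iteration of the repeated group.
        Console.WriteLine(match.Groups["param"].Value);          // c

        // Captures keeps one entry per iteration.
        Console.WriteLine(string.Join(",", ParamCaptures(s)));   // a,b,c

        // Only one type was captured, so types and params do not line up by index.
        Console.WriteLine(match.Groups["type"].Captures.Count);  // 1
    }
}
```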

Some notes:

  • If you want to capture more than one type, you definitely want empty captures, so that types and parameters pair up one-to-one (you could sort the captures by position instead, but a 1-to-1 capture is neater). In that case, put the optional group inside the capturing group: (?<type>(\/\*[a-zA-Z]+\*\/)?)
  • You don't have to escape slashes in .NET patterns - / has no special meaning there (C#/.NET doesn't have regex delimiters).
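The first note can be checked with a small experiment. The sketch below compares the two placements of the optional group on a signature whose typed parameter is in the middle. The class name EmptyCaptureDemo, the TypeCaptures helper, and the sample input are assumptions for illustration. Note that with the optional group inside, the capture includes the comment delimiters.

```csharp
using System;
using System.Text.RegularExpressions;

static class EmptyCaptureDemo
{
    // Optional group OUTSIDE the capture: untyped parameters add no "type" capture.
    public const string Outside =
        @"function\(((/\*(?<type>[a-zA-Z]+)\*/)?(?<param>[0-9a-zA-Z_$]+),?)*\)";

    // Optional group INSIDE the capture: every parameter adds a (possibly empty) capture.
    public const string Inside =
        @"function\(((?<type>(/\*[a-zA-Z]+\*/)?)(?<param>[0-9a-zA-Z_$]+),?)*\)";

    // Hypothetical helper: returns every capture of the "type" group, in order.
    public static string[] TypeCaptures(string pattern, string signature)
    {
        Match match = Regex.Match(signature, pattern, RegexOptions.ExplicitCapture);
        CaptureCollection captures = match.Groups["type"].Captures;
        var values = new string[captures.Count];
        for (int i = 0; i < captures.Count; i++)
            values[i] = captures[i].Value;
        return values;
    }

    static void Main()
    {
        string s = "function(a,/*string*/b,c)";

        // One capture for three parameters: the index no longer says which one it belongs to.
        Console.WriteLine(TypeCaptures(Outside, s).Length);  // 1

        // Three captures for three parameters: index i of "type" pairs with index i of "param".
        Console.WriteLine(TypeCaptures(Inside, s).Length);   // 3
    }
}
```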

Here's an example of using the captures. Again, the main point is maintaining the relation between type and param: you want to capture empty types, so you don't lose count.
Pattern:

function
\(
(?:
    (?:
        /\*(?<type>[a-zA-Z]+)\*/    # type within /* */
        |                           # or
        (?<type>)                   # capture an empty type.
    )
    (?<param>
        [0-9a-zA-Z_$]+
    )
    (?:,|(?=\s*\)))     # mandatory comma, unless before the last ')'
)*
\)

Code:

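// Assumes s holds the signature text and pattern holds the pattern above.
Match match = Regex.Match(s, pattern, RegexOptions.IgnorePatternWhitespace);
// Every iteration captures a type (possibly empty), so both collections have the same length.
CaptureCollection types = match.Groups["type"].Captures;
CaptureCollection parameters = match.Groups["param"].Captures;
// If the match fails, both collections are empty and the loop does not run.
for (int i = 0; i < parameters.Count; i++)
{
    string parameter = parameters[i].Value;
    string type = types[i].Value;
    if (String.IsNullOrEmpty(type))
        type = "NO TYPE";
    Console.WriteLine("Parameter: {0}, Type: {1}", parameter, type);
}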
