Capture outer paren groups while ignoring inner paren groups

Question

I'm using C# and regex, trying to capture outer paren groups while ignoring inner paren groups. I have legacy-generated text files containing thousands of string constructions like the following:

([txtData] of COMPOSITE
(dirty FALSE)
(composite [txtModel])
(view [star3])
(creationIndex 0)
(creationProps )
(instanceNameSpecified FALSE)
(containsObject nil)
(sName txtData)
(txtDynamic FALSE)
(txtSubComposites )
(txtSubObjects )
(txtSubConnections )
)

([txtUI] of COMPOSITE
(dirty FALSE)
(composite [txtModel])
(view [star2])
(creationIndex 0)
(creationProps )
(instanceNameSpecified FALSE)
(containsObject nil)
(sName ApplicationWindow)
(txtDynamic FALSE)
(txtSubComposites )
(txtSubObjects )
(txtSubConnections )
)

([star38] of COMPOSITE
(dirty FALSE)
(composite [txtUI])
(view [star39])
(creationIndex 26)
(creationProps composite [txtUI] sName Bestellblatt)
(instanceNameSpecified TRUE)
(containsObject COMPOSITE)
(sName Bestellblatt)
(txtDynamic FALSE)
(txtSubComposites )
(txtSubObjects )
(txtSubConnections )
)

I am looking for a regex that will capture the 3 groupings in the example above, and here is what I have tried so far:

Regex regex = new Regex(@"\((.*?)\)");
return regex.Matches(str);

The problem with the regex above is that it finds inner paren groupings such as dirty FALSE and composite [txtModel]. But what I want it to match is each of the outer groupings, such as the 3 shown above. The definition of an outer grouping is simple:

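  1. The opening paren is the first character in the file, or it follows a newline and/or carriage return.
  2. The closing paren is either the last character in the file, or it is followed by a newline or carriage return.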

I want the regex pattern to ignore all paren-groupings that don't obey numbers 1 and 2 above. By "ignore" I mean that they shouldn't be seen as a match - but they should be returned as part of the outer grouping match.

So, for my objective to be met, when my C# regex runs against the example above, I should get back a regex MatchCollection with exactly 3 matches, just as shown above.

How is it done? (Thanks in advance.)

Recommended answer

You can do this with .NET balancing groups.

Here is a demo to match outer brackets.

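using System;
using System.Text.RegularExpressions;

string sentence = @"([txtData] of COM ..."; // your text

// A nested '(' pushes onto stack c and a nested ')' pops from it.
// (?(c)(?!)) rejects the match if any nested '(' is still unclosed.
string pattern = @"\((?>\((?<c>)|[^()]+|\)(?<-c>))*(?(c)(?!))\)";
Regex rgx = new Regex(pattern);

// Each match is one complete outer group, inner groups included.
foreach (Match match in rgx.Matches(sentence))
{
    Console.WriteLine(match.Value);
    Console.WriteLine("--------");
}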
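
To make the pattern easier to read, here is a sketch of the same balancing-group expression spread over several lines with `RegexOptions.IgnorePatternWhitespace`. The two lookarounds at the ends are my own addition, not part of the answer above: they are one possible way to encode the question's rules 1 and 2 (opening paren at the start of the file or a line, closing paren at the end of the file or a line). The class name `OuterGroups` and the method `Find` are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OuterGroups
{
    // Inside the pattern, '#' starts a comment because of IgnorePatternWhitespace.
    private static readonly Regex Outer = new Regex(@"
        (?<![^\r\n])        # rule 1: nothing but a line break, or nothing at all, before
        \(                  # opening paren of the outer group
        (?>                 # atomic group: never backtrack into a finished piece
            \( (?<c>)       #   nested open paren: push onto stack c
          | [^()]+          #   run of non-paren characters, line breaks included
          | \) (?<-c>)      #   nested close paren: pop stack c, fails when c is empty
        )*
        (?(c)(?!))          # fail if a nested open paren was never closed
        \)                  # closing paren of the outer group
        (?![^\r\n])         # rule 2: nothing but a line break, or nothing at all, after
    ", RegexOptions.IgnorePatternWhitespace);

    // Returns the text of every outer group, with its inner groups left in place.
    public static List<string> Find(string text)
    {
        var result = new List<string>();
        foreach (Match m in Outer.Matches(text))
        {
            result.Add(m.Value);
        }
        return result;
    }
}
```

Calling `OuterGroups.Find(File.ReadAllText(path))` on a file shaped like the sample in the question should give one string per top-level `( ... )` block. Matches do not overlap, so once an outer block has been consumed, the inner groups inside it are not reported separately, even though each sits on its own line.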
