Java Matcher groups: Understanding the difference between "(?:X|Y)" and "(?:X)|(?:Y)"
Can anyone explain:
- Why do the two patterns used below give different results? (answered below)
- Why does the 2nd example give a group count of 1 but say the start and end of group 1 are -1?
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public void testGroups() throws Exception
{
    String TEST_STRING = "After Yes is group 1 End";
    {
        // Pattern 1: the alternation is confined to the non-capturing group.
        String pattern = "(?:Yes|No)(.*)End";
        Pattern p = Pattern.compile(pattern);
        Matcher m = p.matcher(TEST_STRING);
        boolean f = m.find();
        int count = m.groupCount();
        int start = m.start(1);
        int end = m.end(1);
        System.out.println("Pattern=" + pattern + "\t Found=" + f + " Group count=" + count +
                " Start of group 1=" + start + " End of group 1=" + end);
    }
    {
        // Pattern 2: the alternation is at the top level of the pattern.
        String pattern = "(?:Yes)|(?:No)(.*)End";
        Pattern p = Pattern.compile(pattern);
        Matcher m = p.matcher(TEST_STRING);
        boolean f = m.find();
        int count = m.groupCount();
        int start = m.start(1);
        int end = m.end(1);
        System.out.println("Pattern=" + pattern + "\t Found=" + f + " Group count=" + count +
                " Start of group 1=" + start + " End of group 1=" + end);
    }
}
Which gives the following output:
Pattern=(?:Yes|No)(.*)End Found=true Group count=1 Start of group 1=9 End of group 1=21
Pattern=(?:Yes)|(?:No)(.*)End Found=true Group count=1 Start of group 1=-1 End of group 1=-1
To summarise,
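1) The two patterns give different results because of the precedence rules of the operators: alternation (|) binds more loosely than concatenation, so an unparenthesised | splits the whole pattern in two.
(?:Yes|No)(.*)End
matches (Yes or No) followed by .*End
(?:Yes)|(?:No)(.*)End
matches (Yes) or (No followed by .*End)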
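To see the precedence rule in action, here is a minimal sketch that prints the text of the first match for each pattern against the test string from the question. The class name AlternationScope and the helper firstMatch are made up for this illustration. The third pattern is just the second one with explicit parentheses added to show how it is parsed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationScope {
    // Hypothetical helper: returns the text of the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "After Yes is group 1 End";
        // Alternation confined to the group: "Yes" must be followed by .*End.
        System.out.println(firstMatch("(?:Yes|No)(.*)End", input));      // Yes is group 1 End
        // Top-level alternation: "Yes" alone is a complete match.
        System.out.println(firstMatch("(?:Yes)|(?:No)(.*)End", input));  // Yes
        // Same as the previous pattern, with the implied grouping spelled out.
        System.out.println(firstMatch("(?:Yes|(?:No)(.*)End)", input));  // Yes
    }
}
```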
2) The second pattern gives a group count of 1 but a start and end of -1 because of the (not necessarily intuitive) meanings of the results returned by the Matcher method calls.
Matcher.find() returns true if a match was found. In your case the match was on the (?:Yes) part of the pattern.
Matcher.groupCount() returns the number of capturing groups in the pattern, regardless of whether those groups actually participated in the match. In your case only the non-capturing (?:Yes) part of the pattern participated in the match, but the capturing (.*) group is still part of the pattern, so the group count is 1.
Matcher.start(n) and Matcher.end(n) return the start and end index of the subsequence matched by the nth capturing group. In your case, although an overall match was found, the (.*) capturing group did not participate in the match and so did not capture a subsequence, hence the -1 results.
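As a rough illustration of the point about participation, the sketch below runs the second pattern against two inputs: the test string from the question, where the Yes branch matches, and a made-up input "No way End", where the No branch matches. The class name GroupParticipation and the helper describeGroup1 are invented for this example. Note that Matcher.group(1) returns null in exactly the case where start(1) and end(1) return -1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupParticipation {
    // Hypothetical helper: summarises group 1 for the first match of regex in input.
    static String describeGroup1(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return "groupCount=" + m.groupCount()
                + " start=" + m.start(1)
                + " end=" + m.end(1)
                + " group=" + m.group(1);
    }

    public static void main(String[] args) {
        String regex = "(?:Yes)|(?:No)(.*)End";
        // The Yes branch matches, so group 1 takes no part in the match.
        System.out.println(describeGroup1(regex, "After Yes is group 1 End"));
        // prints: groupCount=1 start=-1 end=-1 group=null

        // The No branch matches, so group 1 captures " way ".
        System.out.println(describeGroup1(regex, "No way End"));
        // prints: groupCount=1 start=2 end=7 group= way 
    }
}
```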
3) (Question asked in comment.) To determine how many capturing groups actually captured a subsequence, call Matcher.start(n) for each n from 0 to Matcher.groupCount() and count the results that are not -1. (Note that group 0 stands for the whole match and is not included in groupCount(); after a successful find(), Matcher.start(0) is never -1, so you may want to exclude it for your purposes.)
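The counting loop could look something like the sketch below. It assumes you want to exclude group 0, so it starts at 1. The names ParticipatingGroups, countParticipating and countFor are invented for this example, and the inputs "No way End" and "NoEnd" are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParticipatingGroups {
    // Counts the capturing groups 1..groupCount() that took part in the current match.
    // Call this only after a successful find(), matches() or lookingAt();
    // otherwise start(i) throws IllegalStateException.
    static int countParticipating(Matcher m) {
        int participating = 0;
        for (int i = 1; i <= m.groupCount(); i++) {
            if (m.start(i) != -1) {
                participating++;
            }
        }
        return participating;
    }

    // Hypothetical convenience wrapper: returns -1 when the pattern does not match at all.
    static int countFor(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? countParticipating(m) : -1;
    }

    public static void main(String[] args) {
        String regex = "(?:Yes)|(?:No)(.*)End";
        System.out.println(countFor(regex, "After Yes is group 1 End")); // 0
        System.out.println(countFor(regex, "No way End"));               // 1
        System.out.println(countFor(regex, "NoEnd"));                    // 1
    }
}
```

The last call shows that a group which matched the empty string still counts as having participated: its start and end are equal, but not -1.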