Java Matcher groups: Understanding the difference between "(?:X|Y)" and "(?:X)|(?:Y)"

Problem description

Can anyone explain:

  1. Why do the two patterns used below give different results? (answered below)
  2. Why does the second example give a group count of 1 but report -1 for both the start and the end of group 1?

 public void testGroups() throws Exception
 {
  String TEST_STRING = "After Yes is group 1 End";
  {
   Pattern p;
   Matcher m;
   String pattern="(?:Yes|No)(.*)End";
   p=Pattern.compile(pattern);
   m=p.matcher(TEST_STRING);
   boolean f=m.find();
   int count=m.groupCount();
   int start=m.start(1);
   int end=m.end(1);

   System.out.println("Pattern=" + pattern + "\t Found=" + f + " Group count=" + count + 
     " Start of group 1=" + start + " End of group 1=" + end );
  }

  {
   Pattern p;
   Matcher m;

   String pattern="(?:Yes)|(?:No)(.*)End";
   p=Pattern.compile(pattern);
   m=p.matcher(TEST_STRING);
   boolean f=m.find();
   int count=m.groupCount();
   int start=m.start(1);
   int end=m.end(1);

   System.out.println("Pattern=" + pattern + "\t Found=" + f + " Group count=" + count + 
     " Start of group 1=" + start + " End of group 1=" + end );
  }

 }

Which gives the following output:

Pattern=(?:Yes|No)(.*)End  Found=true Group count=1 Start of group 1=9 End of group 1=21
Pattern=(?:Yes)|(?:No)(.*)End  Found=true Group count=1 Start of group 1=-1 End of group 1=-1

Solution

To summarise,

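1) The two patterns give different results because of operator precedence: alternation (|) binds more loosely than concatenation, so a | outside any parentheses splits the entire pattern into two branches.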

  • (?:Yes|No)(.*)End matches (Yes or No) followed by .*End
  • (?:Yes)|(?:No)(.*)End matches (Yes) or (No followed by .*End)
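The precedence difference can be seen by printing the overall match (group 0) for each pattern. This is only a small sketch: the class name PrecedenceDemo and the helper firstMatch are made up for illustration, and the third pattern is simply the second one with its implicit branch grouping written out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrecedenceDemo {
    // Hypothetical helper: returns the text of the first overall match
    // (group 0), or null if the pattern does not match anywhere.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "After Yes is group 1 End";

        // The alternation is confined to the non-capturing group.
        System.out.println(firstMatch("(?:Yes|No)(.*)End", input));
        // prints: Yes is group 1 End

        // The alternation splits the whole pattern into two branches.
        System.out.println(firstMatch("(?:Yes)|(?:No)(.*)End", input));
        // prints: Yes

        // Same as the second pattern, with the branches grouped explicitly.
        System.out.println(firstMatch("(?:(?:Yes))|(?:(?:No)(.*)End)", input));
        // prints: Yes
    }
}
```

In the second and third cases the match stops right after "Yes", because the first branch succeeds on its own and the rest of the pattern belongs to the other branch.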

2) The second pattern gives a group count of 1 but a start and end of -1 because of the (not necessarily intuitive) meanings of the results returned by the Matcher method calls.

  • Matcher.find() returns true if a match was found. In your case the match was on the (?:Yes) part of the pattern.
  • Matcher.groupCount() returns the number of capturing groups in the pattern, regardless of whether those groups actually participated in the match. In your case only the non-capturing (?:Yes) part of the pattern participated in the match, but the capturing (.*) group is still part of the pattern, so the group count is 1.
  • Matcher.start(n) and Matcher.end(n) return the start index and the end index (the offset after the last character) of the subsequence captured by the nth capturing group, or -1 if the match succeeded but that group did not participate in it. In your case, although an overall match was found, the (.*) capturing group did not participate in the match and so did not capture a subsequence, hence the -1 results.
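The behaviour described in these three points can be checked side by side. The following is a minimal sketch: the class GroupParticipationDemo, the helper describe, and the extra input "No thanks End" are assumptions added for illustration and are not part of the original question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupParticipationDemo {
    // Hypothetical helper: summarises what the Matcher reports for group 1
    // after the first successful find().
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        // group(1) is null when group 1 did not participate in the match.
        return "groupCount=" + m.groupCount()
                + " start=" + m.start(1)
                + " end=" + m.end(1)
                + " text=[" + m.group(1) + "]";
    }

    public static void main(String[] args) {
        String input = "After Yes is group 1 End";

        System.out.println(describe("(?:Yes|No)(.*)End", input));
        // prints: groupCount=1 start=9 end=21 text=[ is group 1 ]

        System.out.println(describe("(?:Yes)|(?:No)(.*)End", input));
        // prints: groupCount=1 start=-1 end=-1 text=[null]

        // Hypothetical input that exercises the second branch instead.
        System.out.println(describe("(?:Yes)|(?:No)(.*)End", "No thanks End"));
        // prints: groupCount=1 start=2 end=10 text=[ thanks ]
    }
}
```

The group count stays at 1 in all three cases because it is a property of the pattern, while start, end and group depend on which branch produced the match.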

3) (Question asked in a comment.) To determine how many capturing groups actually captured a subsequence, call Matcher.start(n) for each n from 1 to Matcher.groupCount() inclusive and count the results that are not -1. (Group 0 denotes the whole match and is not included in groupCount(); after a successful find() its start is never -1, so start the loop at 0 only if you want the overall match counted as well.)
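One way to write that count is sketched below. The class ParticipatingGroups and its methods are hypothetical names, and the loop deliberately starts at 1 so that group 0 (the whole match) is excluded.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParticipatingGroups {
    // Counts the capturing groups that took part in the current match.
    // Must be called after a successful find() or matches(); otherwise
    // Matcher.start(int) throws IllegalStateException.
    static int count(Matcher m) {
        int participating = 0;
        for (int i = 1; i <= m.groupCount(); i++) {
            if (m.start(i) != -1) {
                participating++;
            }
        }
        return participating;
    }

    // Hypothetical convenience wrapper: returns -1 if there is no match at all.
    static int countFor(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? count(m) : -1;
    }

    public static void main(String[] args) {
        String input = "After Yes is group 1 End";
        System.out.println(countFor("(?:Yes|No)(.*)End", input));     // prints: 1
        System.out.println(countFor("(?:Yes)|(?:No)(.*)End", input)); // prints: 0
    }
}
```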
