Efficiently finding all overlapping matches for a regular expression

Question

This is a follow-up to "All overlapping substrings matching a Java regex".

Is there a way to make this code faster?

public static void allMatches(String text, String regex)
  {
    for (int i = 0; i < text.length(); ++i) {
      for (int j = i + 1; j <= text.length(); ++j) {
        String positionSpecificPattern = "((?<=^.{"+i+"})("+regex+")(?=.{"+(text.length() - j)+"}$))";
        Matcher m = Pattern.compile(positionSpecificPattern).matcher(text);

        if (m.find()) 
        {   
          System.out.println("Match found: \"" + (m.group()) + "\" at position [" + i + ", " + j + ")");
        }   
      }   
    }   
  }

Recommended answer

In the other question you mentioned Matcher's region() method, but you weren't making full use of it. What makes it so valuable here is that anchors such as ^ and $ match at the region's bounds as if they were the bounds of a standalone string. That assumes anchoring bounds are enabled (useAnchoringBounds(true)), which is the default. It also means the pattern can be compiled once and reused for every (i, j) pair, instead of building and compiling a new position-specific pattern each time.
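To illustrate the point about anchoring bounds, here is a minimal sketch. The class name AnchoringBoundsDemo, the helper findsInRegion, and the sample string "ab123cd" are made up for this illustration; only region(), useAnchoringBounds() and find() come from java.util.regex.Matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoringBoundsDemo {
    // Hypothetical helper: reports whether regex finds a match inside text[start, end),
    // with anchoring bounds switched on or off.
    static boolean findsInRegion(String text, String regex, int start, int end, boolean anchoring) {
        Matcher m = Pattern.compile(regex).matcher(text);
        m.region(start, end);            // restrict matching to [start, end)
        m.useAnchoringBounds(anchoring); // true is the default setting
        return m.find();
    }

    public static void main(String[] args) {
        String text = "ab123cd";
        // With anchoring bounds, ^ and $ match at the region edges (indices 2 and 5).
        System.out.println(findsInRegion(text, "^\\d+$", 2, 5, true));  // prints true
        // Without them, ^ and $ only match at the edges of the whole input.
        System.out.println(findsInRegion(text, "^\\d+$", 2, 5, false)); // prints false
    }
}
```

With the default setting, the region "123" behaves like a standalone string, which is what the loop in the answer relies on.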

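// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void allMatches(String text, String regex)
{
  // Compile once and reuse the same Matcher for every candidate substring.
  Matcher m = Pattern.compile(regex).matcher(text);
  int end = text.length();
  for (int i = 0; i < end; ++i)
  {
    for (int j = i + 1; j <= end; ++j)
    {
      // Limit matching to [i, j); ^ and $ in the regex now anchor to these bounds.
      m.region(i, j);

      if (m.find())
      {
        System.out.printf("Match found: \"%s\" at position [%d, %d)%n",
                          m.group(), i, j);
      }
    }
  }
}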

Given your sample string and regex:

allMatches("String t = 04/31 412-555-1235;", "^\\d\\d+$");

...I get this output:

Match found: "04" at position [11, 13)
Match found: "31" at position [14, 16)
Match found: "41" at position [17, 19)
Match found: "412" at position [17, 20)
Match found: "12" at position [18, 20)
Match found: "55" at position [21, 23)
Match found: "555" at position [21, 24)
Match found: "55" at position [22, 24)
Match found: "12" at position [25, 27)
Match found: "123" at position [25, 28)
Match found: "1235" at position [25, 29)
Match found: "23" at position [26, 28)
Match found: "235" at position [26, 29)
Match found: "35" at position [27, 29)
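
The call above depends on the regex carrying its own ^ and $ anchors. As a variation that is not part of the original answer, Matcher.matches() requires the entire region to match, so an unanchored pattern such as \d\d+ should give the same 14 results. The sketch below assumes a hypothetical class RegionMatches and returns the matches as a list instead of printing them directly; it still tries every (i, j) pair, so the number of match attempts stays quadratic in the text length.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionMatches {
    // Collects every substring text[i, j) that the regex matches in full.
    static List<String> allMatches(String text, String regex) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text); // compiled once
        int end = text.length();
        for (int i = 0; i < end; ++i) {
            for (int j = i + 1; j <= end; ++j) {
                m.region(i, j);
                // matches() succeeds only if the whole region [i, j) matches,
                // so the regex itself needs no ^ or $.
                if (m.matches()) {
                    found.add(String.format("\"%s\" at [%d, %d)", m.group(), i, j));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        allMatches("String t = 04/31 412-555-1235;", "\\d\\d+")
            .forEach(System.out::println);
    }
}
```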
