Java Regex Capturing Groups

Question

I am trying to understand this code block. In the first one, what is it we are looking for in the expression?

My understanding is that it matches any character (0 or more times, *), followed by any digit from 0 to 9 (one or more times, +), followed by any character (0 or more times, *).

When executed, the result is:

Found value: This order was placed for QT3000! OK?
Found value: This order was placed for QT300
Found value: 0

Could someone please go through this with me?

What is the advantage of using Capturing groups?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTut3 {

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // In a Java string literal the backslash must be escaped: \\d is the regex \d
        String pattern = "(.*)(\\d+)(.*)";

        // Create a Pattern object
        Pattern r = Pattern.compile(pattern);

        // Now create matcher object.
        Matcher m = r.matcher(line);

        if (m.find()) {
            System.out.println("Found value: " + m.group(0));
            System.out.println("Found value: " + m.group(1));
            System.out.println("Found value: " + m.group(2));
        } else {
            System.out.println("NO MATCH");
        }
    }

}

Recommended answer

The issue you're having is with the type of quantifier. You're using a greedy quantifier in your first group (that is index 1; index 0 represents the whole match), which means it'll match as much as it can. Since it matches any character, it will consume as many characters as possible while still leaving enough for the following groups to match.

In short, your 1st group .* first consumes the whole line and then backtracks only as far as needed for the next group \d+ to match something (in this case, the last digit). That is why group 1 ends in QT300 and group 2 is just 0.

As for the 3rd group, it will match anything after the last digit.
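To make the greedy behaviour described above concrete, here is a minimal sketch that prints all four groups (0 to 3) for the pattern from the question. The class name GreedyGroupsDemo and the helper greedyGroups are made up for this illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyGroupsDemo {

    // Hypothetical helper: returns groups 0..3 of the greedy pattern
    // from the question, or null if the input does not match.
    static String[] greedyGroups(String input) {
        Matcher m = Pattern.compile("(.*)(\\d+)(.*)").matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] g = greedyGroups("This order was placed for QT3000! OK?");
        // Brackets make leading/trailing characters of each group visible
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + i + ": [" + g[i] + "]");
        }
    }
}
```

Group 1 takes everything up to and including QT300, group 2 is left with the single digit 0, and group 3 gets "! OK?", which the original code never printed because it stopped at group 2.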

If you change it to a reluctant quantifier in your 1st group, you'll get the result I suppose you are expecting, that is, the 3000 part.

Note the question mark in the 1st group.

String line = "This order was placed for QT3000! OK?";
Pattern pattern = Pattern.compile("(.*?)(\\d+)(.*)");
Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
    System.out.println("group 1: " + matcher.group(1));
    System.out.println("group 2: " + matcher.group(2));
    System.out.println("group 3: " + matcher.group(3));
}

Output:

group 1: This order was placed for QT
group 2: 3000
group 3: ! OK?

More info on Java Pattern here.

Finally, the capturing groups are delimited by round brackets, and provide a very useful way to use back-references (amongst other things), once your Pattern is matched to the input.
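As a rough illustration of back-references, the sketch below uses a group in two ways: \1 inside the pattern itself to detect a repeated word, and $2 in a replacement string to keep only the numeric group. The class and method names (BackReferenceDemo, firstRepeatedWord, extractNumber) and the sample sentence are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackReferenceDemo {

    // Hypothetical helper: \1 refers back to whatever group 1 captured,
    // so this finds a word immediately repeated after a single space.
    static String firstRepeatedWord(String text) {
        Matcher m = Pattern.compile("\\b(\\w+) \\1\\b").matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Hypothetical helper: $2 in the replacement refers to group 2,
    // so the whole matched line is replaced by its first run of digits.
    static String extractNumber(String line) {
        return line.replaceAll("(.*?)(\\d+)(.*)", "$2");
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedWord("this is is a test"));
        System.out.println(extractNumber("This order was placed for QT3000! OK?"));
    }
}
```

If the input contains no digits, extractNumber returns it unchanged, because replaceAll only substitutes where the pattern matches.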

In Java 6, groups can only be referenced by their index (beware of nested groups and the subtlety of ordering).
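The ordering subtlety is that groups are numbered by the position of their opening parenthesis, counted from left to right, so an outer group gets a lower index than the groups nested inside it. A small sketch, assuming an illustrative pattern that is not from the original answer:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestedGroupsDemo {

    // Illustrative pattern (an assumption for this example):
    // group 1 = (([A-Z]+)(\d+))  outer group, opens first
    // group 2 = ([A-Z]+)         first nested group
    // group 3 = (\d+)            second nested group
    // group 4 = (!)              opens last
    static final Pattern NESTED = Pattern.compile("(([A-Z]+)(\\d+))(!)");

    // Hypothetical helper: returns groups 0..4, or null if nothing matches.
    static String[] nestedGroups(String input) {
        Matcher m = NESTED.matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] g = nestedGroups("This order was placed for QT3000! OK?");
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + i + ": " + g[i]);
        }
    }
}
```

Here group 1 is QT3000, group 2 is QT, group 3 is 3000 and group 4 is the exclamation mark.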

In Java 7 it's much easier, as you can use named groups.
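A named group is written (?<name>...) and read back with Matcher.group(String). The sketch below rewrites the reluctant pattern from this answer with names; the group names prefix, number and rest, and the class NamedGroupsDemo, are chosen for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupsDemo {

    // Same structure as (.*?)(\d+)(.*), with illustrative group names.
    // Requires Java 7 or later.
    static final Pattern ORDER =
            Pattern.compile("(?<prefix>.*?)(?<number>\\d+)(?<rest>.*)");

    // Hypothetical helper: returns the first run of digits, or null if none.
    static String orderNumber(String line) {
        Matcher m = ORDER.matcher(line);
        return m.find() ? m.group("number") : null;
    }

    public static void main(String[] args) {
        Matcher m = ORDER.matcher("This order was placed for QT3000! OK?");
        if (m.find()) {
            System.out.println("prefix: " + m.group("prefix"));
            System.out.println("number: " + m.group("number"));
            System.out.println("rest: " + m.group("rest"));
        }
    }
}
```

Named groups still have numeric indexes, so m.group(2) and m.group("number") return the same text here.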
