Regex Named Groups in Java


Question

It is my understanding that the java.util.regex package does not have support for named groups (http://www.regular-expressions.info/named.html), so can anyone point me towards a third-party library that does?

I've looked at jregex, but its last release was in 2002 and it didn't work for me under Java 5 (admittedly I only tried briefly).

Recommended answer

(Update: August 2011)

As geofflane mentions in his answer, Java 7 now supports named groups.
tchrist points out in a comment that the support is limited.
He details the limitations in his great answer "Java Regex Helper".

Java 7 regex named group support was presented back in September 2010 in Oracle's blog.

In the official release of Java 7, the constructs to support named capturing groups are:


  • (?<name>capturing text) to define a named group "name"
  • \k<name> to backreference a named group "name"
  • ${name} to reference a captured group in Matcher's replacement string
  • Matcher.group(String name) to return the captured input subsequence by the given "named group".
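
The four constructs above can be exercised together in a small sketch. The class name, method names, and the doubled-word pattern are illustrative assumptions, not part of the original answer. Only standard java.util.regex calls available since Java 7 are used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: finds a word that is immediately repeated.
public class NamedGroupDemo {
    // (?<word>...) defines the named group; \k<word> backreferences it.
    // The \b anchors keep the match on whole words.
    private static final Pattern DOUBLED =
            Pattern.compile("\\b(?<word>\\w+) \\k<word>\\b");

    /** Returns the first doubled word in the input, or null if there is none. */
    public static String findDoubledWord(String input) {
        Matcher m = DOUBLED.matcher(input);
        // Matcher.group(String) reads the capture by name.
        return m.find() ? m.group("word") : null;
    }

    /** Collapses each doubled word to one occurrence via ${word} in the replacement. */
    public static String collapseDoubledWords(String input) {
        return DOUBLED.matcher(input).replaceAll("${word}");
    }

    public static void main(String[] args) {
        System.out.println(findDoubledWord("this is is a test"));      // is
        System.out.println(collapseDoubledWords("this is is a test")); // this is a test
    }
}
```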






Alternatives for pre-Java 7

  • Google named-regex (see John Hardy's answer)
    Gábor Lipták mentions (November 2012) that this project might not be active (with several outstanding bugs), and its GitHub fork could be considered instead.
  • jregex (See Brian Clozel's answer)

(Original answer: Jan 2009, with the next two links now broken)

You cannot refer to a named group unless you code your own version of Regex...

That is precisely what Gorbush2 did in this thread:

Regex2

(limited implementation, as pointed out again by tchrist, as it looks only for ASCII identifiers. tchrist details the limitation as:


only being able to have one named group per same name (which you don’t always have control over!) and not being able to use them for in-regex recursion.

Note: You can find true regex recursion examples in Perl and PCRE regexes, as mentioned in Regexp Power, PCRE specs and the Matching Strings with Balanced Parentheses slide)
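
The limitations quoted above are described for Regex2, but similar naming restrictions can be observed in the JDK's own java.util.regex from Java 7 on. The following is a minimal sketch, not a statement about how Regex2 behaves. The class and method names are hypothetical; it simply checks whether Pattern.compile accepts a given pattern.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper for probing which group names the JDK accepts.
public class GroupNameLimits {
    /** Returns true if Pattern.compile accepts the regex, false if it throws. */
    public static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A single ASCII letter/digit name is accepted.
        System.out.println(compiles("(?<id>\\d+)"));             // true
        // Reusing the same name in one pattern is rejected.
        System.out.println(compiles("(?<id>\\d+)|(?<id>\\w+)")); // false
        // Names are limited to ASCII letters and digits, so '_' is rejected.
        System.out.println(compiles("(?<user_id>\\d+)"));        // false
    }
}
```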

Example:

String:

"TEST 123"

RegExp:

"(?<login>\\w+) (?<id>\\d+)"

Access

matcher.group(1) ==> TEST
matcher.group("login") ==> TEST
matcher.name(1) ==> login

Replace

matcher.replaceAll("aaaaa_$1_sssss_$2____") ==> aaaaa_TEST_sssss_123____
matcher.replaceAll("aaaaa_${login}_sssss_${id}____") ==> aaaaa_TEST_sssss_123____ 
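
For comparison, the same string and pattern can be run against the standard java.util.regex classes on Java 7 or later. This is a sketch under that assumption; the class and method names are illustrative. The `matcher.name(1)` call shown above belongs to Regex2, and the sketch does not attempt an equivalent.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class reproducing the "TEST 123" example with the JDK API.
public class LoginIdExample {
    private static final Pattern LOGIN_ID =
            Pattern.compile("(?<login>\\w+) (?<id>\\d+)");

    /** Returns the "login" group if the whole input matches, else null. */
    public static String login(String input) {
        Matcher m = LOGIN_ID.matcher(input);
        return m.matches() ? m.group("login") : null;
    }

    /** Returns group 1 by number; it is the same capture as "login". */
    public static String firstGroup(String input) {
        Matcher m = LOGIN_ID.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    /** Rewrites the input using named references in the replacement string. */
    public static String rewrite(String input) {
        return LOGIN_ID.matcher(input)
                .replaceAll("aaaaa_${login}_sssss_${id}____");
    }

    public static void main(String[] args) {
        System.out.println(login("TEST 123"));      // TEST
        System.out.println(firstGroup("TEST 123")); // TEST
        System.out.println(rewrite("TEST 123"));    // aaaaa_TEST_sssss_123____
    }
}
```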






(extract from the implementation)

public final class Pattern
    implements java.io.Serializable
{
[...]
    /**
     * Parses a group and returns the head node of a set of nodes that process
     * the group. Sometimes a double return system is used where the tail is
     * returned in root.
     */
    private Node group0() {
        boolean capturingGroup = false;
        Node head = null;
        Node tail = null;
        int save = flags;
        root = null;
        int ch = next();
        if (ch == '?') {
            ch = skip();
            switch (ch) {

            case '<':   // (?<xxx)  look behind or group name
                ch = read();
                int start = cursor;
[...]
                // test forGroupName
                int startChar = ch;
                while(ASCII.isWord(ch) && ch != '>') ch=read();
                if(ch == '>'){
                    // valid group name
                    int len = cursor-start;
                    int[] newtemp = new int[2*(len) + 2];
                    //System.arraycopy(temp, start, newtemp, 0, len);
                    StringBuilder name = new StringBuilder();
                    for(int i = start; i< cursor; i++){
                        name.append((char)temp[i-1]);
                    }
                    // create Named group
                    head = createGroup(false);
                    ((GroupTail)root).name = name.toString();

                    capturingGroup = true;
                    tail = root;
                    head.next = expr(tail);
                    break;
                }
