Capture group multiple times

Views: 42 Published: 2021/7/7 18:30:47 Tags: java regex regex-group


Problem description

Lately I have been playing around with regex in Java, and I have run into a problem that is (theoretically) easy to solve, but I was wondering whether there is an easier way to do it (yes, yes, I am lazy). The problem is capturing a group multiple times, like this:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptureGroups {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("A (IvI(.*?)IvI)*? A");
        Matcher m = p.matcher("A IvI asd IvI IvI qwe IvI A"); // ANY NUMBER of IvI x IvI sections
        //Matcher m = p.matcher("A  A");
        int loi = 0; // last occurrence index
        String storage;
        // Attempt 1: after each match, restart the search at the end of group 1
        while (loi >= 0 && m.find(loi)) {
            System.out.println(m.group(1));
            if ((storage = m.group(2)) != null) {
                System.out.println(storage);
            }
            loi = m.end(1); // -1 when group 1 did not take part in the match
        }
        // Attempt 2 ("2 opt"): match the whole string again, then scan group 1 with a second pattern
        if (!m.find(0)) {
            return;
        }
        System.out.println("2 opt");
        Pattern p2 = Pattern.compile("IvI(.*?)IvI");
        Matcher m2 = p2.matcher(m.group(1)); // m.group(1) = "IvI asd IvI IvI qwe IvI"
        loi = 0;
        while (loi >= 0 && m2.find(loi)) {
            if ((storage = m2.group(1)) != null) {
                System.out.println(storage);
            }
            loi = m2.end(0);
        }
    }
}

Using ONLY Pattern p, is there any way to get what is inside the IvI's (for the test string, that would be "asd" and "qwe")? There could be any number of IvI sections. I am after something like what I attempt in the first while loop: find the first occurrence of the group, then move the index and search for the next occurrence, and so on.

With the code in that while loop, group 2 comes back as asd IvI IvI qwe, not asd and then qwe. I suppose this is partly because of the (.*?) part: it is not supposed to be greedy, yet it runs all the way to qwe, consuming two of the IvI's. I mention this because otherwise I might be able to use the end index of each capture with the matcher.find(anInt) method, but that does not work either. I don't think there is anything wrong with the regex, since the next code works without consuming the IvI.

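import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyQuantifierCheck {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("(.*?)IvI");
        Matcher m = p.matcher("bla bla blaIvI");
        m.find();
        System.out.println(m.group(1)); // text before the first IvI
    }
}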

Prints: bla bla bla
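To make the behaviour described above concrete, here is a minimal sketch that only inspects what groups 1 and 2 of Pattern p hold for the test string. The class name LazyGroupDemo and the helper groups are made up for illustration. A lazy quantifier still expands whenever the rest of the pattern cannot match with a shorter capture. Here, after the first closing IvI, neither " A" nor a second repetition of the group can match, because a space sits between the two sections, so (.*?) keeps growing until the final IvI.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LazyGroupDemo {
    // Returns {group 1, group 2} for the question's pattern p, or null if nothing matches.
    static String[] groups(String input) {
        Matcher m = Pattern.compile("A (IvI(.*?)IvI)*? A").matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("A IvI asd IvI IvI qwe IvI A");
        // Brackets make the leading and trailing spaces visible.
        System.out.println("[" + g[0] + "]"); // [IvI asd IvI IvI qwe IvI]
        System.out.println("[" + g[1] + "]"); // [ asd IvI IvI qwe ]
    }
}
```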

There is one solution I already know (but remember, I am lazy)

(It is also in the first code, below the "2 opt" message.) The solution is to split the match into sub-groups and use another regex that processes those sub-groups one at a time...
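The two-step workaround just described could be sketched roughly as follows. The class name TwoStepExtract, the method sections, and the trim() call are assumptions added for illustration. The trimming is there only because the question expects "asd" and "qwe" without the surrounding spaces. The sketch relies on Matcher.find() with no argument, which continues after the previous match, so no manual index bookkeeping is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TwoStepExtract {
    // Outer pattern: the question's Pattern p. Inner pattern: the question's p2.
    private static final Pattern OUTER = Pattern.compile("A (IvI(.*?)IvI)*? A");
    private static final Pattern INNER = Pattern.compile("IvI(.*?)IvI");

    static List<String> sections(String input) {
        List<String> result = new ArrayList<>();
        Matcher outer = OUTER.matcher(input);
        // group(1) is null when the repeated group matched zero times (e.g. "A  A").
        if (!outer.find() || outer.group(1) == null) {
            return result;
        }
        Matcher inner = INNER.matcher(outer.group(1));
        while (inner.find()) {
            result.add(inner.group(1).trim()); // trim() is an assumption, see the note above
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sections("A IvI asd IvI IvI qwe IvI A")); // [asd, qwe]
    }
}
```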

BTW, I did my homework. This page mentions:

Since a capture group with a quantifier holds on to its number, what value does the engine return when you inspect the group? All engines return the last value captured. For instance, if you match the string A_B_C_D_ with ([A-Z]_)+, when you inspect the match, Group 1 will be D_. With the exception of the .NET engine, all intermediate values are lost. In essence, Group 1 gets overwritten each time its pattern is matched.
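The quoted behaviour is easy to check in Java. The sketch below assumes the pattern ([A-Z]_)+, with the underscore inside the group so that the repetition can span the whole of A_B_C_D_. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LastCaptureDemo {
    // Returns what group 1 holds after matching a quantified group, or null if nothing matches.
    static String lastCapture(String input) {
        Matcher m = Pattern.compile("([A-Z]_)+").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The overall match covers A_B_C_D_, but group 1 keeps only the final iteration.
        System.out.println(lastCapture("A_B_C_D_")); // D_
    }
}
```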

But I am still hoping you can give me good news...
