Matching one string multiple times using regex in Java


Question

I'm having some issues with making the following regex work. I would like the following string:

"Please enter your name here"

to result in an array with the following elements:

'please enter', 'enter your', 'your name', 'name here'

Currently, I'm using the following pattern, and then creating a matcher and iterating in the following way:

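// Needs java.util.* and java.util.regex.*; "\\w" is how the regex \w is written in a Java string literal.
List<String> wordList = new ArrayList<>();
Pattern word = Pattern.compile("\\w+ \\w+");
Matcher m = word.matcher("Please enter your name here");

while (m.find()) {
    wordList.add(m.group());
}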

But the result I get is:

'please enter', 'your name'

What am I doing wrong? (P.S. I checked the same regex on regexpal.com and had the same problem.) It seems like the same word won't be matched twice. What can I do to achieve the result I want?

Thanks.

----------------------------------

Edit:
Thanks for all the suggestions! I ended up doing this (because it adds flexibility in being able to easily specify the number of "n-grams"):

Integer nGrams = 2;
String patternTpl = "\\b[\\w']+\\b";
String concatString = "what is your age? please enter your name.";
for (int i = 0; i < nGrams; i++) {
    // Create pattern.
    String pattern = patternTpl;
    for (int j = 0; j < i; j++) {
        pattern = pattern + " " + patternTpl;
    }
    pattern = "(?=(" + pattern + "))";
    Pattern word = Pattern.compile(pattern);
    Matcher m = word.matcher(concatString);

    // Iterate over all words and populate wordList
    while (m.find()) {
        wordList.add(m.group(1));
    }
}

This results in:

Pattern: 
(?=(\b[\w']+\b)) // In the first iteration
(?=(\b[\w']+\b \b[\w']+\b)) // In the second iteration

Array:
[what, is, your, age, please, enter, your, name, what is, is your, your age, please enter, enter your, your name]

Note: Got the pattern from the top answer to the following question: Java regex skipping matches
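
The loop from the edit can be wrapped in a reusable method. The following is only a sketch under a few assumptions: the names NGramExtractor, nGrams and maxN are hypothetical and not part of the original post, and the word pattern is the same \b[\w']+\b used in the edit. It collects every 1-gram up to maxN-gram by placing the whole n-word pattern inside a lookahead, so that no text is consumed between matches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NGramExtractor {

    // Same single-word pattern as in the edit: word characters and apostrophes.
    private static final String WORD = "\\b[\\w']+\\b";

    // Returns all 1-grams, then all 2-grams, ... up to maxN-grams (hypothetical helper).
    static List<String> nGrams(String text, int maxN) {
        List<String> wordList = new ArrayList<>();
        for (int n = 1; n <= maxN; n++) {
            // Join n copies of the word pattern with single spaces.
            String body = String.join(" ", Collections.nCopies(n, WORD));
            // The lookahead matches an empty string, so overlapping n-grams are all found;
            // the n-gram itself is read from capturing group 1.
            Matcher m = Pattern.compile("(?=(" + body + "))").matcher(text);
            while (m.find()) {
                wordList.add(m.group(1));
            }
        }
        return wordList;
    }

    public static void main(String[] args) {
        System.out.println(nGrams("Please enter your name here", 2));
    }
}
```

Because the words are joined with a literal single space, n-grams are only found across exactly one space; words separated by punctuation or by several spaces are not paired, which is consistent with the array shown above.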

Recommended answer

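The matches can't overlap, which explains your result: each call to find() resumes where the previous match ended, so a word consumed by one match (such as "enter") cannot start the next one. Here's a potential workaround, making use of capturing groups with a positive lookahead (http://www.regular-expressions.info/lookaround.html), which checks for the second word without consuming it: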

Pattern word = Pattern.compile("(\\w+)(?=(\\s\\w+))");
Matcher m = word.matcher("Please enter your name here");

while (m.find()) {
    System.out.println(m.group(1) + m.group(2));
}

Output:

Please enter
enter your
your name
name here
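
To make the difference concrete, here is a small self-contained sketch that runs both patterns side by side. The class and method names (OverlappingBigrams, consuming, overlapping) are made up for illustration; the two patterns are the one from the question (with the backslashes escaped for Java) and the one from this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlappingBigrams {

    // Plain matching: each match consumes both words, so find() resumes after them.
    static List<String> consuming(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile("\\w+ \\w+").matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    // Lookahead matching: only the first word is consumed; the second word is
    // captured inside (?=...) and remains available to start the next match.
    static List<String> overlapping(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile("(\\w+)(?=(\\s\\w+))").matcher(input);
        while (m.find()) {
            // Group 2 includes the leading whitespace character.
            result.add(m.group(1) + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "Please enter your name here";
        System.out.println(consuming(input));   // [Please enter, your name]
        System.out.println(overlapping(input)); // [Please enter, enter your, your name, name here]
    }
}
```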
