How to detect duplicate words from a String in Java?
Problem description
What are the ways to detect duplicate words in a String?
For example, "this is a test message for duplicate test" contains one duplicate word: test.
The objective is to detect all duplicate words that occur in a String.
A solution based on regular expressions is preferred.
The following Java code detects duplicate words in a String with a regular expression. Because [\\w\\W]* matches any character, including line breaks, duplicates separated by newlines or punctuation are still detected.
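import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateWords {
    public static void main(String[] args) {
        // (?i) makes the match case-insensitive; group 1 captures a whole word,
        // and \\1 requires that same word to appear again later as a whole word.
        String duplicatePattern = "(?i)\\b(\\w+)\\b[\\w\\W]*\\b\\1\\b";
        Pattern p = Pattern.compile(duplicatePattern);
        String phrase = "this is#$;%@;<>?|\\` p is a is Test\n of duplicate test";
        Matcher m = p.matcher(phrase);
        while (m.find()) {
            String val = m.group(); // whole segment, from first to last occurrence
            System.out.println("Matching segment is \"" + val + "\"");
            System.out.println("Duplicate word: " + m.group(1) + "\n");
        }
    }
}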
The output of the code will be:
Matching segment is "is#$;%@;<>?|\` p is a is"
Duplicate word: is
Matching segment is "Test
of duplicate test"
Duplicate word: Test
Here, m.group(1) returns the text captured by the first capturing group of the pattern, which in this case is (\\w+).
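The loop above reports non-overlapping matches, and each match consumes everything between the two occurrences, so a repeated word that lies inside an already matched segment can be missed. For example, with the input "a b a b" only a is reported. A minimal sketch of one possible variant, which is not the method of the original answer, moves the search for the second occurrence into a lookahead so that no text is consumed. The class name AllDuplicateWords and the helper findDuplicates are hypothetical names chosen for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllDuplicateWords {
    // Hypothetical helper: returns every word that occurs more than once,
    // lower-cased, in order of first appearance.
    static Set<String> findDuplicates(String text) {
        // The lookahead (?=...) only checks that the captured word occurs
        // again later; it does not consume the text in between.
        Pattern p = Pattern.compile("(?i)\\b(\\w+)\\b(?=[\\w\\W]*\\b\\1\\b)");
        Matcher m = p.matcher(text);
        Set<String> duplicates = new LinkedHashSet<>();
        while (m.find()) {
            // group(1) is the word itself; the set removes repeated reports
            duplicates.add(m.group(1).toLowerCase(Locale.ROOT));
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates("a b a b")); // prints [a, b]
    }
}
```

For the sample phrase used earlier, this sketch returns [is, test]. Each lookahead rescans the rest of the input, so the cost grows roughly quadratically with the length of the text; for large inputs, splitting the text into words and counting them in a map may be a better fit.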