Match word in String in Java
Question
I'm trying to match Strings that contain the word "#SP" (sans quotes, case insensitive) in Java. However, I'm finding regexes very difficult to use!
Strings I need to match:
"This is a sample #sp string"
"#SP string text..."
"String text #Sp"
Strings I do not want to match:
"Anything with #Spider"
"#Spin #Spoon #SPORK"
Here's what I have so far: http://ideone.com/B7hHkR. Could someone guide me through building my regexp?
I've also tried: \\\\ * * \\ * *#sp\\w * \\ * to no avail.
Edit: Here's the code from IDEone:
java.util.regex.Pattern p =
    java.util.regex.Pattern.compile("\\b#SP\\b",
        java.util.regex.Pattern.CASE_INSENSITIVE);
java.util.regex.Matcher m = p.matcher("s #SP s");
if (m.find()) {
    System.out.println("Match!");
}
Recommended answer
You're doing fine, but the \b in front of the # is misleading. \b is a word boundary: it matches only at a position between a word character and a non-word character. # is not a word character (i.e. it isn't in the set [0-9A-Za-z_]), and neither is a space, so the position between the space and the # isn't a word boundary and the pattern fails there. Change to:
java.util.regex.Pattern p =
    java.util.regex.Pattern.compile("(^|\\s)#SP\\b",
        java.util.regex.Pattern.CASE_INSENSITIVE);
The (^|\s) means: match either ^ OR \s, where ^ means the beginning of your string (e.g. "#SP String"), and \s means a whitespace character.
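As a rough check, the corrected pattern can be run against the sample strings from the question. This is only a sketch: the class name SpTagDemo and the helper containsSpTag are made up for illustration and are not part of the original code.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative assumptions.
class SpTagDemo {
    // Pattern from the question: \b before '#' requires a word character
    // immediately before the '#', so it fails after a space or at the start.
    static final Pattern ORIGINAL =
        Pattern.compile("\\b#SP\\b", Pattern.CASE_INSENSITIVE);

    // Pattern from the answer: '#SP' must follow the start of the string or
    // whitespace, and must end at a word boundary (which rejects "#Spider").
    static final Pattern FIXED =
        Pattern.compile("(^|\\s)#SP\\b", Pattern.CASE_INSENSITIVE);

    // Returns true if the text contains "#SP" as a standalone tag.
    static boolean containsSpTag(String text) {
        return FIXED.matcher(text).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "This is a sample #sp string",
            "#SP string text...",
            "String text #Sp",
            "Anything with #Spider",
            "#Spin #Spoon #SPORK"
        };
        for (String s : samples) {
            System.out.println(containsSpTag(s) + " <- " + s);
        }
    }
}
```

With these inputs, the three strings that should match report true and the two that should not report false. Note that (^|\s) consumes the leading whitespace, so the matched text includes it; if only the tag itself is needed, a lookbehind such as (?<!\S) is one possible alternative.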