RegEx To Ignore Text Between Quotes
Question
I have a regex, [\\.|\\;|\\?|\\!][\\s], which I use to split a string.
But I don't want it to split on . ; ? ! when they are inside quotes.
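To make the problem concrete, here is a minimal sketch of what that regex does to a string containing a quoted sentence. The class name NaiveSplitDemo, the helper naiveSplit, and the sample text are made up for illustration and are not part of the original question. Note also that inside a character class | is a literal character, so the class as written matches a pipe as well as . ; ? !

```java
import java.util.Arrays;
import java.util.List;

public class NaiveSplitDemo {
    // The regex from the question, written as a Java string literal.
    static final String DELIMITER = "[\\.|\\;|\\?|\\!][\\s]";

    // Hypothetical helper: splits on every delimiter, quoted or not.
    public static List<String> naiveSplit(String text) {
        return Arrays.asList(text.split(DELIMITER));
    }

    public static void main(String[] args) {
        String text = "He said \"Stop! Wait.\" Then he left. Bye";
        for (String part : naiveSplit(text)) {
            System.out.println("> " + part);
        }
    }
}
```

With this sample text the quoted part is cut in two after "Stop!", because the ! is followed by a space, and that is exactly the behavior to avoid.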
Recommended answer
I wouldn't use split here; I'd use Pattern and Matcher instead.
A demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String text = "start. \"in quotes!\"; foo? \"more \\\" words\"; bar";
        // A run of characters that are not delimiters, whitespace or quotes.
        String simpleToken = "[^.;?!\\s\"]+";
        // The escapes \\r and \\n are written out in the class below because Java's
        // comments mode (?x) ignores literal whitespace even inside a character class.
        String quotedToken =
            "(?x)             # enable inline comments and ignore white spaces in the regex         \n" +
            "\"               # match a double quote                                                \n" +
            "(                # open group 1                                                        \n" +
            "  \\\\.          # match a backslash followed by any char (other than line breaks)     \n" +
            "  |              # OR                                                                  \n" +
            "  [^\\\\\\r\\n\"]  # any character other than a backslash, line breaks or double quote \n" +
            ")                # close group 1                                                       \n" +
            "*                # repeat group 1 zero or more times                                   \n" +
            "\"               # match a double quote                                                \n";
        // Try the quoted token first so that a quote always starts a quoted token.
        String regex = quotedToken + "|" + simpleToken;
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            System.out.println("> " + m.group());
        }
    }
}
This produces:
> start
> "in quotes!"
> foo
> "more \" words"
> bar
As you can see, it can also handle escaped quotes inside quoted tokens.
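The same idea can be adapted if you want to keep the original split semantics (cut at a delimiter followed by whitespace) rather than extract tokens. The following is only a sketch under that assumption: the class name QuoteAwareSplit and its split method are hypothetical names, not part of the original answer. The pattern matches either a whole quoted token, which is skipped, or a delimiter plus one whitespace character captured in group 1, and the text is cut only where group 1 matched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteAwareSplit {
    // Either a whole quoted token (which may contain backslash escapes),
    // or, captured in group 1, a delimiter followed by one whitespace char.
    private static final Pattern TOKEN = Pattern.compile(
            "\"(?:\\\\.|[^\\\\\"])*\"|([.;?!]\\s)");

    // Hypothetical helper: splits at delimiters that are outside quotes.
    public static List<String> split(String text) {
        List<String> parts = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        int start = 0;
        while (m.find()) {
            if (m.group(1) != null) {
                // Group 1 only participates for a delimiter outside quotes.
                parts.add(text.substring(start, m.start()));
                start = m.end();
            }
            // Otherwise a quoted token matched: skip over it without cutting.
        }
        parts.add(text.substring(start)); // remainder after the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        String text = "He said \"Stop! Wait.\" Then he left. Bye";
        for (String part : split(text)) {
            System.out.println("> " + part);
        }
    }
}
```

For the sample text in main, the text is cut only after "left.", so the quoted "Stop! Wait." stays in one piece. An unterminated quote is not treated as a quoted token in this sketch, so delimiters after it are split on as usual.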