Regex for splitting a string using space when not surrounded by single or double quotes

Question

I'm new to regular expressions and would appreciate your help. I'm trying to put together an expression that splits the example string on every space that is not surrounded by single or double quotes. My last attempt looks like this: (?!") and isn't quite working. It's splitting on the space before the quote.

Example input:

This is a string that "will be" highlighted when your 'regular expression' matches something.

Desired output:

This
is
a
string
that
will be
highlighted
when
your
regular expression
matches
something.

Note that "will be" and 'regular expression' retain the space between the words.

Recommended answer

I don't understand why all the others are proposing such complex regular expressions or such long code. Essentially, you want to grab two kinds of things from your string: sequences of characters that aren't whitespace or quotes, and sequences of characters that begin and end with a quote, with no quote of that kind in between, for each of the two kinds of quotes. You can easily match those things with this regular expression:

[^\s"']+|"([^"]*)"|'([^']*)'

I added the capturing groups because you don't want the quotes in the list.

This Java code builds the list, adding the capturing group if it matched (to exclude the quotes), and adding the overall regex match if neither capturing group matched (that is, an unquoted word was matched).

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString is the input string to split
List<String> matchList = new ArrayList<>();
// Group 1: contents of a double-quoted run; group 2: contents of a single-quoted run
Pattern regex = Pattern.compile("[^\\s\"']+|\"([^\"]*)\"|'([^']*)'");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    if (regexMatcher.group(1) != null) {
        // Add double-quoted string without the quotes
        matchList.add(regexMatcher.group(1));
    } else if (regexMatcher.group(2) != null) {
        // Add single-quoted string without the quotes
        matchList.add(regexMatcher.group(2));
    } else {
        // Add unquoted word
        matchList.add(regexMatcher.group());
    }
}
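
To try the loop above on the example input from the question, it can be wrapped in a small self-contained program. This is only a sketch: the class name QuoteAwareSplitter and the method name split are made up for illustration and do not come from the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; the name is an assumption for illustration only.
final class QuoteAwareSplitter {
    // Unquoted word | "double-quoted" (group 1) | 'single-quoted' (group 2)
    private static final Pattern TOKEN =
            Pattern.compile("[^\\s\"']+|\"([^\"]*)\"|'([^']*)'");

    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                tokens.add(m.group(1));   // double-quoted, quotes stripped
            } else if (m.group(2) != null) {
                tokens.add(m.group(2));   // single-quoted, quotes stripped
            } else {
                tokens.add(m.group());    // unquoted word
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "This is a string that \"will be\" highlighted "
                + "when your 'regular expression' matches something.";
        // Prints one token per line, matching the desired output in the question
        for (String token : split(input)) {
            System.out.println(token);
        }
    }
}
```

Compiling the pattern once as a static constant avoids recompiling it on every call, which matters only if split is called often.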

If you don't mind having the quotes in the returned list, you can use much simpler code:

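import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

List<String> matchList = new ArrayList<>();
// No capturing groups: each match is added as-is, quotes included
Pattern regex = Pattern.compile("[^\\s\"']+|\"[^\"]*\"|'[^']*'");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}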
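
As a quick check of what the simpler variant returns, here is a minimal sketch that runs it on a shortened sample string. The class name QuoteKeepingSplitter and the method name split are assumptions introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; the name is an assumption for illustration only.
final class QuoteKeepingSplitter {
    // Same alternatives as before, but without capturing groups
    private static final Pattern TOKEN =
            Pattern.compile("[^\\s\"']+|\"[^\"]*\"|'[^']*'");

    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());   // whole match, surrounding quotes kept
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("that \"will be\" your 'regular expression' matches"));
        // prints: [that, "will be", your, 'regular expression', matches]
    }
}
```

The quoted phrases stay together as single items, but each keeps its surrounding quote characters.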
