Regex: add spaces between all punctuation


Question

I need to add spaces around every punctuation character in a string.

\\ "Hello: World." -> "Hello : World ."
\\ "It's 9:00?"    -> "It ' s 9 : 00 ?"
\\ "1.B,3.D!"      -> "1 . B , 3 . D !"

I think a regex is the way to go: match all non-punctuation with [a-zA-Z\\d]+, add a space before and/or after, then extract the remainder by matching all punctuation with [^a-zA-Z\\d]+.

But I don't know how to call this regex repeatedly (recursively?). Looking at the first example, the regex will only match "Hello". I was thinking of building a new string by repeatedly removing the first match from the original string and appending it to the result, for as long as the original string is not empty.

private String addSpacesBeforePunctuation(String s) {
    StringBuilder builder = new StringBuilder();
    final String nonpunctuation = "[a-zA-Z\\d]+";
    final String punctuation = "[^a-zA-Z\\d]+";

    String found;
    while (!s.isEmpty()) {

        // regex stuff goes here

        found = ???; // found group from respective regex goes here
        builder.append(found);
        builder.append(" ");
        s = s.replaceFirst(found, "");
    }

    return builder.toString().trim();
}

However, this doesn't feel like the right way to go. I think I'm overcomplicating things.
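For reference, the loop described above could be sketched with Matcher.find(), which walks through successive matches without modifying the input. This is only an illustration of the asker's idea, not the recommended answer: the class name TokenSpacer, the method name addSpaces, and the combined TOKEN pattern are assumptions introduced here. Note that this sketch also collapses runs of whitespace into a single space and treats any character outside [a-zA-Z\\d] (including non-ASCII letters) as punctuation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenSpacer {
    // Hypothetical combined pattern: a token is either a run of ASCII
    // letters/digits, or one single non-whitespace character that is
    // not a letter or digit.
    private static final Pattern TOKEN =
            Pattern.compile("[a-zA-Z\\d]+|[^a-zA-Z\\d\\s]");

    static String addSpaces(String s) {
        StringBuilder builder = new StringBuilder();
        Matcher matcher = TOKEN.matcher(s);
        // find() advances to the next token on each call, so the input
        // string never has to be shortened by hand.
        while (matcher.find()) {
            if (builder.length() > 0) {
                builder.append(' ');
            }
            builder.append(matcher.group());
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(addSpaces("Hello: World.")); // Hello : World .
        System.out.println(addSpaces("It's 9:00?"));    // It ' s 9 : 00 ?
        System.out.println(addSpaces("1.B,3.D!"));      // 1 . B , 3 . D !
    }
}
```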

Recommended answer

You can use a lookaround-based regex with the \p{Punct} punctuation property in Java:

str = str.replaceAll("(?<=\\S)(?:(?<=\\p{Punct})|(?=\\p{Punct}))(?=\\S)", " ");

  • (?<=\\S) asserts that the previous character is not whitespace
  • (?<=\\p{Punct}) asserts that the previous character is a punctuation character
  • (?=\\p{Punct}) asserts that the next character is a punctuation character
  • (?=\\S) asserts that the next character is not whitespace

The whole pattern is zero-width, so replaceAll inserts a space at each matching position without consuming any characters.
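To check the pattern against the three examples from the question, a minimal self-contained sketch follows. The class name PunctuationSpacer and the method name addSpaces are assumptions made for illustration; only the regex itself comes from the answer. The pattern is precompiled once, which behaves the same as calling str.replaceAll with the same regex.

```java
import java.util.regex.Pattern;

class PunctuationSpacer {
    // Zero-width position that has non-whitespace on both sides and
    // a punctuation character either just before or just after it.
    private static final Pattern BOUNDARY = Pattern.compile(
            "(?<=\\S)(?:(?<=\\p{Punct})|(?=\\p{Punct}))(?=\\S)");

    static String addSpaces(String s) {
        // Each matching position receives exactly one space.
        return BOUNDARY.matcher(s).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(addSpaces("Hello: World.")); // Hello : World .
        System.out.println(addSpaces("It's 9:00?"));    // It ' s 9 : 00 ?
        System.out.println(addSpaces("1.B,3.D!"));      // 1 . B , 3 . D !
    }
}
```

Because the (?<=\\S) and (?=\\S) checks skip positions that already touch whitespace, existing spaces are not doubled: the space after the colon in "Hello: World." stays a single space. By default \p{Punct} in Java covers the ASCII punctuation characters, so non-ASCII punctuation would need a different character class.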
