Java replace characters with uppercase around (before and after) specific character


Question

I have this kind of input

word w'ord wo'rd

I need to convert two characters to uppercase: the one at the start of the word and the one right after each ' character (which can occur multiple times).

The output I need (using the previous example) is

word W'Ord Wo'Rd

I tried a simple pattern

s.replaceAll("(\\w)(\\w*)'(\\w)", "$1");

but I'm unable to convert groups 1 and 3 to uppercase.
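To make the limitation concrete: the replacement string passed to String#replaceAll can refer back to captured groups (such as $1) but has no syntax for changing their case. The sketch below only shows what the attempt above actually returns; the class and method names are hypothetical.

```java
public class ReplaceAllAttempt {
    // Applies the pattern from the question with the "$1" replacement string.
    static String attempt(String s) {
        return s.replaceAll("(\\w)(\\w*)'(\\w)", "$1");
    }

    public static void main(String[] args) {
        // Each match is replaced by group 1 alone, so group 2, the quote
        // and group 3 are dropped, and nothing is uppercased.
        System.out.println(attempt("word w'ord wo'rd")); // word wrd wd
    }
}
```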

After I discovered a small mistake in the main question, I edited @Wiktor Stribizew's code to include the case I had missed.

Matcher m = Pattern.compile("(\\w)(\\w*)'(\\w)").matcher(s);
StringBuffer result = new StringBuffer();
while (m.find()) {
    m.appendReplacement(result, m.group(1).toUpperCase() + m.group(2) + "'" + m.group(3).toUpperCase());
}
m.appendTail(result);
s = result.toString();

Recommended answer

You need to use Matcher#appendReplacement in Java to be able to process the match. Here is an example:

String s = "word w'ord wo'rd";
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile("\\b(\\w)(\\w*)'(\\w(?:'\\w)*)").matcher(s);
while (m.find()) {
    m.appendReplacement(result, 
        m.group(1).toUpperCase()+m.group(2) + "'" + m.group(3).toUpperCase());
}
m.appendTail(result);
System.out.println(result.toString());
// => word W'Ord Wo'Rd

See the Java demo.

Java 9+ equivalent (demo):

String s = "wo'rd w'ord wo'r'd";
Matcher m = Pattern.compile("\\b(\\w)(\\w*)'(\\w(?:'\\w)*)").matcher(s);
System.out.println(
    m.replaceAll(r -> r.group(1).toUpperCase()+r.group(2) + "'" + r.group(3).toUpperCase())
);
//wo'rd w'ord wo'r'd => Wo'Rd W'Ord Wo'R'D
//word w'ord wo'rd => word W'Ord Wo'Rd

Pattern details:

  • \b - a leading word boundary
  • (\w) - Group 1: a single word char
  • (\w*) - Group 2: zero or more word chars
  • ' - a single quote
  • (\w(?:'\w)*) - Group 3:
    • \w - a word char
    • (?:'\w)* - zero or more sequences of:
      • ' - a single quote
      • \w - a word char.
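The breakdown above can be checked by printing what each group captures. This is a minimal sketch with a hypothetical class name and output format; only the pattern itself comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupInspector {
    // Same pattern as in the answer.
    static final Pattern P = Pattern.compile("\\b(\\w)(\\w*)'(\\w(?:'\\w)*)");

    // Returns one line per match, listing the text captured by groups 1 to 3.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        Matcher m = P.matcher(s);
        while (m.find()) {
            sb.append(m.group())
              .append(" -> 1=[").append(m.group(1))
              .append("] 2=[").append(m.group(2))
              .append("] 3=[").append(m.group(3)).append("]\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe("word w'ord wo'r'd"));
        // w'o -> 1=[w] 2=[] 3=[o]
        // wo'r'd -> 1=[w] 2=[o] 3=[r'd]
    }
}
```

Group 3 can span several quote-plus-letter pairs (r'd here), which is why uppercasing the whole group is enough: toUpperCase() leaves the quotes untouched.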

Now, if you want to make the pattern more precise, you can replace each \w that is supposed to match a lowercase letter with \p{Ll}, and the \w that is supposed to match any letter with \p{L}. The pattern would look like "(?U)\\b(\\p{Ll})(\\p{L}*)'(\\p{Ll}(?:'\\p{Ll})*)". However, you risk leaving some letters after ' in lowercase if an uppercase letter comes before them (as in wo'r'D's -> Wo'R'D's). (?U) is the inline modifier for Pattern.UNICODE_CHARACTER_CLASS, which makes the \b word boundary Unicode-aware.
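The stricter pattern and its caveat can be tried out with the same appendReplacement loop as before. This is a sketch under the assumption that the loop is reused unchanged; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeVariant {
    // Stricter pattern from the paragraph above: only lowercase letters are
    // matched in the positions that get uppercased.
    static final Pattern STRICT =
        Pattern.compile("(?U)\\b(\\p{Ll})(\\p{L}*)'(\\p{Ll}(?:'\\p{Ll})*)");

    static String capitalize(String s) {
        Matcher m = STRICT.matcher(s);
        StringBuffer result = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(result,
                m.group(1).toUpperCase() + m.group(2) + "'" + m.group(3).toUpperCase());
        }
        m.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalize("word w'ord wo'rd")); // word W'Ord Wo'Rd
        // Group 3 stops at the uppercase D, so the final s is never reached.
        System.out.println(capitalize("wo'r'D's"));         // Wo'R'D's
    }
}
```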
