How to replace characters in a Java String?

Question

I would like to replace a certain set of characters in a string with their corresponding replacement characters in an efficient way.

For example:

String sourceCharacters = "šđćčŠĐĆČžŽ";
String targetCharacters = "sdccSDCCzZ";

String result = replaceChars("Gračišće", sourceCharacters , targetCharacters );

assert "Gracisce".equals(result);

Is there a more efficient way than using the replaceAll method of the String class?

My first idea was this:

final String s = "Gračišće";
String sourceCharacters = "šđćčŠĐĆČžŽ";
String targetCharacters = "sdccSDCCzZ";

// preparation
final char[] sourceString = s.toCharArray();
final char result[] = new char[sourceString.length];
final char[] targetCharactersArray = targetCharacters.toCharArray();

// main work
for(int i=0,l=sourceString.length;i<l;++i)
{
  final int pos = sourceCharacters.indexOf(sourceString[i]);
  result[i] = pos!=-1 ? targetCharactersArray[pos] : sourceString[i];
}

// result
String resultString = new String(result);

Any ideas?

By the way, the UTF-8 characters are what cause the trouble; with US_ASCII it works fine.

Recommended answer

You can use java.text.Normalizer and a bit of regex to get rid of the diacritics, of which there are many more than you have collected so far.

Here's an SSCCE; copy, paste, and run it on Java 6:

package com.stackoverflow.q2653739;

import java.text.Normalizer;
import java.text.Normalizer.Form;

public class Test {

    public static void main(String... args) {
        System.out.println(removeDiacriticalMarks("Gračišće"));
    }

    public static String removeDiacriticalMarks(String string) {
        return Normalizer.normalize(string, Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }
}

This should yield

Gracisce

At least, it does here in Eclipse with the console character encoding set to UTF-8 (Window > Preferences > General > Workspace > Text File Encoding). Ensure that the same is set in your environment as well.
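One caveat with the Normalizer approach: it only strips marks from characters that have a canonical decomposition. The letters đ and Đ (U+0111 and U+0110) from the question's character list have none, so NFD leaves them untouched. A minimal sketch of how the two could be combined follows. The class name DiacriticsDemo and the helper toAscii are hypothetical names chosen for illustration, and the string literals use Unicode escapes so that the result does not depend on the source file encoding.

```java
import java.text.Normalizer;

class DiacriticsDemo {

    // Decompose to NFD, then drop the combining marks that were split off.
    static String removeDiacriticalMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // Hypothetical helper: additionally maps the stroked d, which NFD
    // does not decompose, to its plain ASCII counterpart.
    static String toAscii(String s) {
        return removeDiacriticalMarks(s)
            .replace('\u0111', 'd')   // U+0111 LATIN SMALL LETTER D WITH STROKE
            .replace('\u0110', 'D');  // U+0110 LATIN CAPITAL LETTER D WITH STROKE
    }

    public static void main(String[] args) {
        // "Gra\u010di\u0161\u0107e" is the question's sample word written with escapes.
        System.out.println(toAscii("Gra\u010di\u0161\u0107e"));
        // "\u0110akovo" starts with U+0110, which only the extra replace handles.
        System.out.println(toAscii("\u0110akovo"));
    }
}
```

Other letters without a canonical decomposition (for example ø or ł) would need the same kind of explicit mapping.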

As an alternative, maintain a Map<Character, Character>:

Map<Character, Character> charReplacementMap = new HashMap<Character, Character>();
charReplacementMap.put('š', 's');
charReplacementMap.put('đ', 'd');
// Put more here.

String originalString = "Gračišće";
StringBuilder builder = new StringBuilder();

for (char currentChar : originalString.toCharArray()) {
    Character replacementChar = charReplacementMap.get(currentChar);
    builder.append(replacementChar != null ? replacementChar : currentChar);
}

String newString = builder.toString();
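For completeness, here is a sketch of how the map could be filled from the two strings in the question rather than by hand, so that it matches the replaceChars signature asked for. The class name CharMapDemo and the helper buildMap are assumptions made for this example, not part of any library. The source characters are written as Unicode escapes to sidestep the file encoding issue mentioned above, and the code handles one char at a time, so it assumes all characters are in the Basic Multilingual Plane.

```java
import java.util.HashMap;
import java.util.Map;

class CharMapDemo {

    // Hypothetical helper: pairs the i-th source character with the i-th target character.
    static Map<Character, Character> buildMap(String source, String target) {
        if (source.length() != target.length()) {
            throw new IllegalArgumentException("source and target must have equal length");
        }
        Map<Character, Character> map = new HashMap<Character, Character>();
        for (int i = 0; i < source.length(); i++) {
            map.put(source.charAt(i), target.charAt(i));
        }
        return map;
    }

    // Replaces every mapped character; unmapped characters are copied unchanged.
    static String replaceChars(String input, String source, String target) {
        Map<Character, Character> map = buildMap(source, target);
        StringBuilder builder = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char current = input.charAt(i);
            Character replacement = map.get(current);
            builder.append(replacement != null ? replacement.charValue() : current);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // The question's source set, in order, with its target "sdccSDCCzZ".
        String source = "\u0161\u0111\u0107\u010d\u0160\u0110\u0106\u010c\u017e\u017d";
        String target = "sdccSDCCzZ";
        System.out.println(replaceChars("Gra\u010di\u0161\u0107e", source, target));
    }
}
```

If the same replacement set is applied to many strings, building the map once and reusing it avoids repeating the setup work on every call.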
