Converting Symbols, Accent Letters to English Alphabet

Views: 120 | Published: 2018/11/26 13:09:47 | Tags: java unicode special-characters diacritics

Problem description

The problem is that, as you know, there are thousands of characters in the Unicode charts, and I want to convert all the look-alike characters to the corresponding letters of the English alphabet.

For instance, here are a few conversions:

ҥ->H
Ѷ->V
Ȳ->Y
Ǭ->O
Ƈ->C
tђє Ŧค๓เℓy --> the Family
...

I also saw that there are more than 20 versions of the letter A/a, and I don't know how to classify them. They look like needles in a haystack.

The complete list of Unicode characters is at http://www.ssec.wisc.edu/~tomw/java/unicode.html or http://unicode.org/charts/charindex.html . Just try scrolling down to see the variations of the letters.

How can I convert all of these with Java? Please help me :(

Recommended answer

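From "How do I remove diacritics (accents) from a string in .NET?":

This method works fine in Java (purely for the purpose of removing diacritical marks, also known as accents).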

It basically converts each accented character into its unaccented counterpart followed by its combining diacritics (Unicode NFD normalization). You can then use a regex to strip off the diacritics.
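To make the decomposition step concrete before the full method, here is a minimal sketch that prints the code points of a single accented letter before and after NFD normalization. The class name `NfdDemo` and the helper `codePoints` are made up for this illustration and are not part of the original answer.

```java
import java.text.Normalizer;

public class NfdDemo {
    // Renders each code point of s as U+XXXX so that combining marks become visible.
    public static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // U+00E9 is "e with acute" stored as one precomposed code point.
        String composed = "\u00E9";
        // NFD splits it into the base letter and a separate combining mark.
        String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);

        System.out.println(codePoints(composed));   // U+00E9
        System.out.println(codePoints(decomposed)); // U+0065 U+0301
    }
}
```

After normalization the base letter `e` (U+0065) stands on its own, and the combining acute accent (U+0301) falls in the Combining Diacritical Marks block that the regex removes.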

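import java.text.Normalizer;
import java.util.regex.Pattern;

public class DeAccent {
    // Matches runs of combining diacritical marks (U+0300 to U+036F); compiled once.
    private static final Pattern DIACRITICS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    public static String deAccent(String str) {
        // NFD splits each accented character into a base letter plus combining marks.
        String nfdNormalizedString = Normalizer.normalize(str, Normalizer.Form.NFD);
        // Drop the combining marks and keep only the base letters.
        return DIACRITICS.matcher(nfdNormalizedString).replaceAll("");
    }
}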
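Note that this approach only helps for characters that have a canonical decomposition, such as Ȳ and Ǭ from the question. Characters such as ҥ and Ƈ have no decomposition, and Ѷ decomposes to another Cyrillic letter (Ѵ) rather than to V, so normalization alone does not map them to English letters. One possible workaround, sketched here as an assumption and not as part of the original answer, is to follow the diacritic stripping with a hand-maintained lookup table. The class name `LookalikeFolder`, the method `fold`, and the table entries are hypothetical and would need to be extended for real input.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class LookalikeFolder {
    private static final Pattern DIACRITICS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    // Hypothetical, hand-picked table keyed by code point; only covers examples
    // from the question and must be extended for other look-alike characters.
    private static final Map<Integer, String> LOOKALIKES = new HashMap<>();
    static {
        LOOKALIKES.put(0x04A5, "H"); // U+04A5 Cyrillic small ligature en ghe
        LOOKALIKES.put(0x0187, "C"); // U+0187 Latin capital C with hook
        LOOKALIKES.put(0x0474, "V"); // U+0474 Cyrillic izhitsa, left after U+0476 loses its accent
    }

    public static String fold(String input) {
        // Step 1: decompose and strip combining marks, as in deAccent.
        String stripped = DIACRITICS
                .matcher(Normalizer.normalize(input, Normalizer.Form.NFD))
                .replaceAll("");
        // Step 2: replace any remaining look-alike found in the table.
        StringBuilder out = new StringBuilder();
        stripped.codePoints().forEach(cp -> {
            String replacement = LOOKALIKES.get(cp);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("\u0232 \u01EC"));        // Y O
        System.out.println(fold("\u04A5 \u0476 \u0187")); // H V C
    }
}
```

The table is kept deliberately small. Which characters count as "similar" is a judgment call, so any real mapping has to be chosen per application.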
