How to change diacritic characters to non-diacritic ones
I've found an answer on Stack Overflow explaining how to remove diacritic characters, but could you please tell me whether it is possible to change diacritic characters to non-diacritic ones?
Oh, and I'm thinking of .NET (or another platform if that's not possible).
Copying from my own answer to another question (http://stackoverflow.com/questions/285228/how-to-convert-utf-8-to-us-ascii-in-java#285791):
Instead of creating your own table, you could convert the text to normalization form D (NFD), where each character is represented as a base character plus its diacritics (for instance, "á" will be replaced by "a" followed by a combining acute accent). You can then strip everything which is not an ASCII letter.
The tables still exist, but are now the ones from the Unicode standard.
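As a rough illustration of the idea above, here is a minimal sketch in Python (chosen only for brevity; .NET and Java expose equivalent normalization APIs). The function name `strip_diacritics` is made up for this example, and the filter keeps every ASCII character rather than only ASCII letters, so that spaces, digits and punctuation survive. That relaxation is an assumption of this sketch, not part of the original answer.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Hypothetical helper: decompose to NFD, then drop non-ASCII code points."""
    # NFD splits a precomposed character such as U+00E1 ("a" with acute)
    # into the base letter "a" followed by U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only ASCII code points. Combining marks fall outside ASCII and
    # are removed. Assumption: all ASCII is kept, not just letters.
    return "".join(ch for ch in decomposed if ord(ch) < 128)


if __name__ == "__main__":
    print(strip_diacritics("Cr\u00e8me br\u00fbl\u00e9e"))
```

Note that characters without a canonical decomposition, such as "ø" (U+00F8), are not split into a base letter plus a mark, so this approach drops them entirely instead of mapping them to "o".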
You could also try NFKD instead of NFD, to catch even more cases.
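To show what "more cases" can mean in practice, the following sketch compares the two forms on a string containing the "ﬁ" ligature (U+FB01). The helper name `to_ascii` and the sample string are assumptions made for this example. NFKD applies compatibility decompositions in addition to the canonical ones, so the ligature becomes the two letters "f" and "i", whereas NFD leaves it untouched and the ASCII filter then discards it.

```python
import unicodedata


def to_ascii(text: str, form: str = "NFD") -> str:
    """Hypothetical helper: normalize with the given form, keep ASCII only."""
    decomposed = unicodedata.normalize(form, text)
    return "".join(ch for ch in decomposed if ord(ch) < 128)


# Sample: the "fi" ligature (U+FB01) followed by "anc" and "e" with acute.
sample = "\ufb01anc\u00e9"

nfd_result = to_ascii(sample, "NFD")    # ligature has no canonical decomposition
nfkd_result = to_ascii(sample, "NFKD")  # ligature is expanded to "f" + "i"

if __name__ == "__main__":
    print(nfd_result)
    print(nfkd_result)
```

The trade-off is that NFKD also flattens distinctions you may want to keep (for example, superscript digits become plain digits), so which form fits depends on the text being processed.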
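References:
- http://unicode.org/reports/tr15/
- http://www.siao2.com/2005/02/19/376617.aspx
- http://www.siao2.com/2007/05/14/2629747.aspx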