Convert Unicode char to closest (most similar) char in ASCII (.NET)
Question
How do I convert different Unicode characters to their closest ASCII equivalents? For example, Ä -> A. I googled but didn't find any suitable solution. The trick Encoding.ASCII.GetBytes("Ä")[0] didn't work (the result was ?).
I found that there is a class Encoder that has a Fallback property meant exactly for cases where a char can't be converted, but the implementations (EncoderReplacementFallback) are unhelpful and just convert to ?.
Any ideas?
Recommended answer
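If it is just a matter of removing the diacritical marks, then see this answer. It decomposes the string (FormD) so that each accented character becomes a base character followed by combining marks, drops the combining marks, and recomposes the result: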
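using System.Globalization;
using System.Text;

static string RemoveDiacritics(string stIn) {
    // Decompose each character into base character + combining marks (Ä -> A + U+0308).
    string stFormD = stIn.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();
    for (int ich = 0; ich < stFormD.Length; ich++) {
        // Keep everything except the combining (non-spacing) marks.
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
        if (uc != UnicodeCategory.NonSpacingMark) {
            sb.Append(stFormD[ich]);
        }
    }
    // Recompose whatever is left into canonical composed form.
    return sb.ToString().Normalize(NormalizationForm.FormC);
}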
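Note that RemoveDiacritics only strips combining marks. Characters that have no canonical decomposition (for example ß or ø) pass through unchanged, so the result is not guaranteed to be pure ASCII. If a strictly ASCII result is needed, one possible approach is to run the stripped string through an ASCII encoding with a replacement fallback. The following is a minimal sketch under that assumption; the class name AsciiFolding and the method ToAscii are hypothetical names chosen for illustration and are not part of .NET.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper (not part of .NET): strips diacritics, then forces
// the result into ASCII, substituting a replacement string for anything
// that still has no ASCII representation.
static class AsciiFolding
{
    public static string ToAscii(string input, string replacement = "?")
    {
        // Step 1: decompose, so that "Ä" becomes "A" followed by U+0308.
        string decomposed = input.Normalize(NormalizationForm.FormD);

        // Step 2: drop the combining (non-spacing) marks.
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        // Step 3: encode to ASCII. Characters that are still non-ASCII
        // (e.g. "ß", which has no decomposition) become the replacement string.
        // The replacement itself is assumed to be ASCII-only.
        Encoding ascii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback(replacement),
            new DecoderReplacementFallback("?"));

        return ascii.GetString(ascii.GetBytes(sb.ToString()));
    }
}
```

With this sketch, AsciiFolding.ToAscii("Äpfel") returns "Apfel", while AsciiFolding.ToAscii("Straße") returns "Stra?e" because ß has no base-plus-mark decomposition. Mapping such characters to something more meaningful (ß -> ss, ø -> o) would need an explicit lookup table, which is outside what normalization provides.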