Convert Unicode char to closest (most similar) char in ASCII (.NET)
Question
How do I convert different Unicode characters to their closest ASCII equivalents, for example Ä -> A? I googled but didn't find a suitable solution. The trick Encoding.ASCII.GetBytes("Ä")[0] didn't work (the result was ?).
I found that there is a class Encoder that has a Fallback property, which is meant for exactly the case where a char can't be converted, but the built-in implementations (EncoderReplacementFallback) are too simplistic and just convert to ?.
Any ideas?
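For reference, the behaviour described above can be reproduced with a short sketch. The class and method names (AsciiFallbackDemo, EncodeWithDefaultFallback, EncodeWithReplacement) are made up for illustration; only Encoding.ASCII, Encoding.GetEncoding and EncoderReplacementFallback come from .NET. It shows that the replacement fallback can only substitute a fixed string and never looks for a similar character.

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Round-trips text through Encoding.ASCII, which uses a replacement
    // fallback of "?" for every character outside the ASCII range.
    public static string EncodeWithDefaultFallback(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(bytes);
    }

    // Same round trip, but with an explicit replacement string. The fallback
    // still substitutes one fixed string regardless of the input character.
    public static string EncodeWithReplacement(string text, string replacement)
    {
        Encoding ascii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback(replacement),
            new DecoderReplacementFallback("?"));
        return ascii.GetString(ascii.GetBytes(text));
    }

    public static void Main()
    {
        // "\u00C4" is the precomposed character Ä.
        Console.WriteLine(EncodeWithDefaultFallback("\u00C4"));  // prints "?"
        Console.WriteLine(EncodeWithReplacement("\u00C4", "_")); // prints "_"
    }
}
```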
Recommended answer
If you only need to remove the diacritical marks, then head to this answer:
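using System.Globalization;
using System.Text;

// Strips combining diacritical marks, e.g. "Ä" becomes "A".
static string RemoveDiacritics(string stIn) {
    // FormD splits each accented character into a base character plus combining marks.
    string stFormD = stIn.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();
    for (int ich = 0; ich < stFormD.Length; ich++) {
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
        // Keep everything except the combining (non-spacing) marks.
        if (uc != UnicodeCategory.NonSpacingMark) {
            sb.Append(stFormD[ich]);
        }
    }
    // Recompose whatever is left into FormC.
    return sb.ToString().Normalize(NormalizationForm.FormC);
}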
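A minimal usage sketch of the diacritic-stripping approach, assuming it is followed by a plain ASCII encode. The wrapper class AsciiFolding and the helper ToAsciiApproximation are hypothetical names added for illustration and are not part of the answer. Note that characters with no canonical decomposition (such as ß or ø) are not touched by normalization, so they still end up as ? after encoding.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class AsciiFolding
{
    // Same technique as the answer: decompose, drop combining marks, recompose.
    public static string RemoveDiacritics(string input)
    {
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // Hypothetical helper: strip diacritics first, then let Encoding.ASCII
    // replace anything that is still outside the ASCII range with "?".
    public static string ToAsciiApproximation(string input)
    {
        string stripped = RemoveDiacritics(input);
        return Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(stripped));
    }

    public static void Main()
    {
        Console.WriteLine(ToAsciiApproximation("\u00C4rger"));  // prints "Arger"
        Console.WriteLine(ToAsciiApproximation("Stra\u00DFe")); // prints "Stra?e"
    }
}
```

For characters like ß or ø, a hand-written substitution table applied before the ASCII encode is one possible way to cover the gap; which replacements count as "closest" is a judgment call that depends on the application.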