Convert Unicode char to closest (most similar) char in ASCII (.NET)


Question

How do I convert Unicode characters to their closest ASCII equivalents, for example Ä -> A? I searched but didn't find a suitable solution. The trick Encoding.ASCII.GetBytes("Ä")[0] didn't work (the result was ?).

I found that there is an Encoder class with a Fallback property meant exactly for cases where a char can't be converted, but the built-in implementations (such as EncoderReplacementFallback) are too simplistic and just convert to ?.
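To make the behavior described above concrete, here is a minimal sketch of what the built-in fallback does. The class and method names (AsciiFallbackDemo, EncodeDefault, EncodeWithReplacement) are made up for illustration; only the Encoding and fallback types come from .NET. It shows that a replacement fallback can change which string is substituted, but it has no notion of a "closest" character.

```csharp
using System;
using System.Text;

// Hypothetical demo class; the names are illustrative only.
public static class AsciiFallbackDemo
{
    // Round-trips text through the default ASCII encoding,
    // which substitutes '?' for every char it cannot map.
    public static string EncodeDefault(string text)
    {
        return Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(text));
    }

    // Same round trip, but with a caller-chosen replacement string.
    public static string EncodeWithReplacement(string text, string replacement)
    {
        Encoding ascii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback(replacement),
            new DecoderReplacementFallback(replacement));
        return ascii.GetString(ascii.GetBytes(text));
    }

    public static void Main()
    {
        // "\u00C4" is the precomposed character Ä.
        Console.WriteLine(EncodeDefault("\u00C4b"));              // prints ?b
        Console.WriteLine(EncodeWithReplacement("\u00C4b", "_")); // prints _b
    }
}
```

Passing an empty replacement string drops unmappable characters instead of substituting them, which is still not the same as mapping Ä to A.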

Any ideas?

Recommended answer

If it is just a matter of removing the diacritical marks, then head to this answer:

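// Requires: using System.Globalization; using System.Text;
static string RemoveDiacritics(string stIn) {
  // Decompose each character into a base letter plus combining marks (Ä becomes A + U+0308).
  string stFormD = stIn.Normalize(NormalizationForm.FormD);
  StringBuilder sb = new StringBuilder();

  for (int ich = 0; ich < stFormD.Length; ich++) {
    // Keep every char except the non-spacing (combining) marks.
    UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
    if (uc != UnicodeCategory.NonSpacingMark) {
      sb.Append(stFormD[ich]);
    }
  }

  // Recompose whatever is left into canonical composed form.
  return sb.ToString().Normalize(NormalizationForm.FormC);
}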
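Stripping combining marks only helps for characters that Unicode can decompose. Letters such as ß, ø or Ł have no canonical decomposition and pass through unchanged, so the result may still contain non-ASCII characters. The following is a rough sketch of one way to handle that, not part of the answer above: the AsciiFolding class, the ToAscii method, and the tiny Extra lookup table are all hypothetical and would need to be extended for real data.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical helper; names and table contents are illustrative only.
public static class AsciiFolding
{
    // Deliberately tiny hand-written table for letters that
    // Unicode normalization does not decompose.
    static readonly Dictionary<char, string> Extra = new Dictionary<char, string>
    {
        { '\u00DF', "ss" }, // ß
        { '\u00F8', "o" },  // ø
        { '\u00D8', "O" },  // Ø
        { '\u0142', "l" },  // ł
        { '\u0141', "L" },  // Ł
        { '\u00E6', "ae" }, // æ
        { '\u00C6', "AE" }  // Æ
    };

    public static string ToAscii(string input, char placeholder = '?')
    {
        // Step 1: split composed characters into base letter + combining marks.
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Step 2: drop the combining marks themselves.
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
                continue;

            if (c < 128)
                sb.Append(c);                 // already ASCII
            else if (Extra.TryGetValue(c, out string mapped))
                sb.Append(mapped);            // known special case
            else
                sb.Append(placeholder);       // nothing sensible to map to
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // "Ärger in Łódź, Straße" written with escapes to avoid source-encoding issues.
        Console.WriteLine(ToAscii("\u00C4rger in \u0141\u00F3d\u017A, Stra\u00DFe"));
        // prints Arger in Lodz, Strasse
    }
}
```

The placeholder is emitted once per UTF-16 char, so a character outside the Basic Multilingual Plane produces two placeholders in this sketch.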
