How would you get an array of Unicode code points from a .NET String?

Views: 186 Published: 2016/9/8 18:51:28 Tags: c# string unicode char astral-plane

Question

I have a list of character range restrictions that I need to check a string against, but the char type in .NET is a UTF-16 code unit, so characters outside the Basic Multilingual Plane are stored as surrogate pairs. As a result, when I enumerate the chars in a string I don't get 32-bit Unicode code points, and comparisons against ranges with high values fail.

I understand Unicode well enough that I could parse the bytes myself if necessary, but I'm looking for a C#/.NET Framework BCL solution. So ...

How would you convert a string to an array (int[]) of 32-bit Unicode code points?

Recommended answer

This answer is not correct. See @Virtlink's answer for the correct one.

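using System.Collections.Generic;
using System.Globalization;

static int[] ExtractScalars(string s)
{
  // Compose combining sequences (Form C) where a precomposed character exists.
  if (!s.IsNormalized())
  {
    s = s.Normalize();
  }

  List<int> chars = new List<int>((s.Length * 3) / 2);

  // Enumerates text elements (grapheme clusters), not individual code points.
  var ee = StringInfo.GetTextElementEnumerator(s);

  while (ee.MoveNext())
  {
    string e = ee.GetTextElement();
    // Takes only the first code point of each text element;
    // any further code points in the same element are dropped.
    chars.Add(char.ConvertToUtf32(e, 0));
  }

  return chars.ToArray();
}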

Notes: Normalization is required to deal with composite characters.
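
For comparison, here is a minimal sketch of walking the string one code point at a time instead of one text element at a time. It is not the answer referred to above, whose text is not reproduced on this page. The class and method names (CodePoints, ToCodePoints) are hypothetical, and keeping unpaired surrogates as their raw UTF-16 value is an assumption made for illustration. Only char.IsSurrogatePair and char.ConvertToUtf32 from the BCL are used.

```csharp
using System;
using System.Collections.Generic;

static class CodePoints
{
    // Hypothetical helper: returns one int per Unicode code point in s.
    public static int[] ToCodePoints(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));

        var result = new List<int>(s.Length);
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsSurrogatePair(s, i))
            {
                // High + low surrogate: combine into a single code point.
                result.Add(char.ConvertToUtf32(s, i));
                i++; // skip the low surrogate already consumed
            }
            else
            {
                // BMP character, or an unpaired surrogate kept as its raw
                // UTF-16 value (assumption: the caller decides how to treat it).
                result.Add(s[i]);
            }
        }
        return result.ToArray();
    }
}
```

This version does not normalize, so the values match the string as stored. It also keeps every code point of a combining sequence: "q\u0301" (q followed by a combining acute accent, which has no precomposed form) yields two values, whereas taking only the first code point of each text element would yield one.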
