Return code point of characters in C#
Question
Can I return the code point of a character? For example, if the input is "A", then the output should be "U+0041". Any help please? Many thanks.
Recommended answer
Easy, since a char in C# is actually a UTF-16 code unit, which for any character in the 16-bit range is the same value as its code point:
char x = 'A';
Console.WriteLine("U+{0:X4}", (int)x); // X4 = uppercase hex, at least 4 digits
To address the comments: a char in C# is a 16-bit number and holds one UTF-16 code unit. Code points above the 16-bit range (above U+FFFF) cannot be represented in a single C# char, because chars in C# are not variable width. A string, however, can have two chars following each other, each being a code unit, that together form a surrogate pair encoding one code point. If you have a string input with characters above the 16-bit range, you can use char.IsSurrogatePair and Char.ConvertToUtf32, as suggested in another answer:
string input = ....
for (int i = 0; i < input.Length; i += Char.IsSurrogatePair(input, i) ? 2 : 1)
{
    int x = Char.ConvertToUtf32(input, i);
    Console.WriteLine("U+{0:X4}", x);
}
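For reference, the loop above can be wrapped in a small self-contained program. This is only a sketch: the class name CodePointDemo, the helper ToCodePoints, and the sample string are assumptions made for illustration and are not part of the original answer. It uses only char.IsSurrogatePair and char.ConvertToUtf32, the two calls named in the answer.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointDemo
{
    // Hypothetical helper (not from the original answer):
    // returns every code point in the string formatted as "U+XXXX".
    public static List<string> ToCodePoints(string input)
    {
        var result = new List<string>();

        // Advance by 2 chars when a surrogate pair starts at i, otherwise by 1.
        for (int i = 0; i < input.Length; i += char.IsSurrogatePair(input, i) ? 2 : 1)
        {
            // Combines a surrogate pair into one code point,
            // or returns the value of a single char unchanged.
            int codePoint = char.ConvertToUtf32(input, i);
            result.Add($"U+{codePoint:X4}");
        }

        return result;
    }

    public static void Main()
    {
        // Sample input (an assumption): 'A' followed by U+1F600,
        // which is stored as two chars (a surrogate pair).
        Console.WriteLine(string.Join(" ", ToCodePoints("A\U0001F600")));
        // prints "U+0041 U+1F600"
    }
}
```

Note that char.ConvertToUtf32(string, int) throws an ArgumentException when the index points at an unpaired surrogate, so input that may contain malformed UTF-16 would need an extra check.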