Defining 4-byte UTF-16 character in a string


Question

I have read a question about UTF-8, UTF-16 and UCS-2, and almost all answers state that UCS-2 is obsolete and that C# uses UTF-16.

However, all my attempts to create the 4-byte character U+1D11E in C# failed, so I suspect that C# only uses the UCS-2 subset of UTF-16.

Here are my attempts:

string s = "\u1D11E"; // gives the 2 character string "ᴑE", because \u1D11 is ᴑ
string s = (char) 0x1D11E; // won't compile because of an overflow
string s = Encoding.Unicode.GetString(new byte[] {0xD8, 0x34, 0xDD, 0x1E}); // gives 㓘ờ

Are C# strings really UTF-16, or are they actually UCS-2? If they are UTF-16, how do I get the violin clef into a C# string?

Recommended answer


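Use a capital U instead. The \u escape takes exactly four hex digits (which is why "\u1D11E" is parsed as \u1D11 followed by E), while \U takes eight: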

  string s = "\U0001D11E";
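
To check that the resulting string really is UTF-16 rather than UCS-2, the sketch below inspects the code units. It is a minimal illustration, not part of the original answer, and it assumes a compiler that supports top-level statements (C# 9 or later). The variable names are made up for the example. It builds the character in three ways and shows that it is stored as a surrogate pair, two char values for one code point.

```csharp
using System;

// U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane,
// so UTF-16 stores it as a surrogate pair: two 16-bit char values.
string viaEscape = "\U0001D11E";                 // 8-digit escape
string viaPair = "\uD834\uDD1E";                 // explicit surrogate pair
string viaApi = char.ConvertFromUtf32(0x1D11E);  // built from the code point

Console.WriteLine(viaEscape.Length);                                 // 2
Console.WriteLine(((int)viaEscape[0]).ToString("X4"));               // D834
Console.WriteLine(((int)viaEscape[1]).ToString("X4"));               // DD1E
Console.WriteLine(viaEscape == viaPair && viaPair == viaApi);        // True
Console.WriteLine(char.ConvertToUtf32(viaEscape, 0).ToString("X"));  // 1D11E
```

Length counts UTF-16 code units, not characters, so it reports 2 here. This also explains why the cast (char) 0x1D11E cannot work: a single char holds only 16 bits.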

And you overlooked the byte order: Encoding.Unicode is little-endian UTF-16 (like most machines), so the low byte of each 16-bit code unit comes first:

  string t = Encoding.Unicode.GetString(new byte[] { 0x34, 0xD8, 0x1E, 0xDD });
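
As a hedged illustration of the byte-order point, the following sketch encodes the same string with both UTF-16 byte orders. It again assumes top-level statements (C# 9 or later), and the variable names are chosen only for this example. It also shows that the byte sequence from the question decodes correctly once it is read as big-endian with Encoding.BigEndianUnicode.

```csharp
using System;
using System.Text;

string clef = "\U0001D11E";

// Encoding.Unicode is UTF-16 little-endian: low byte of each code unit first.
byte[] le = Encoding.Unicode.GetBytes(clef);
// Encoding.BigEndianUnicode is UTF-16 big-endian: high byte first.
byte[] be = Encoding.BigEndianUnicode.GetBytes(clef);

Console.WriteLine(BitConverter.ToString(le));  // 34-D8-1E-DD
Console.WriteLine(BitConverter.ToString(be));  // D8-34-DD-1E

// The byte order used in the question is valid when decoded as big-endian.
string fromBe = Encoding.BigEndianUnicode.GetString(new byte[] { 0xD8, 0x34, 0xDD, 0x1E });
Console.WriteLine(fromBe == clef);             // True
```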
