What version of Unicode is supported by which .NET platform and on which version of Windows in regards to character classes?

Question

Updated question¹

With regard to character classes, comparison, sorting, normalization and collation, which Unicode version or versions are supported by which .NET platforms?

Original question

I remember, somewhat vaguely, having read that .NET supports Unicode version 3.0 and that its internal UTF-16 encoding is not really UTF-16 but actually UCS-2, which is not the same thing. It seems, for instance, that characters above U+FFFF are not possible. Consider:

string s = "\u1D7D9"; // ("Mathematical double-struck digit one") 

and it stores the string ᵽ9.

I'm basically looking for definitive references that answer the following:

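  • If .NET does not use true UTF-16, what does it use?
  • What version of Unicode is supported by .NET?
  • If the latest version is not supported, or is only planned for the near future, does anyone know of a (non-)commercial library, or of a way I can work around this?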

¹) I updated the question because, with the passing of time, the new wording seems more appropriate with respect to the answers and to the larger community. I left the original question in place; parts of it have been answered in the comments. Also, while the old UCS-2 (no surrogates) was used in now-ancient 32-bit Windows versions, .NET has always used UTF-16 (with surrogates) internally.

Recommended answer

Internally, .NET is UTF-16. In some cases, e.g. when ASP.NET writes to a response, it uses UTF-8 by default. Both encodings can handle the higher planes.
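To make "both can handle the higher planes" concrete, here is a minimal sketch that encodes U+1D7D9 (the character from the question) with both encodings. The class and method names (EncodingDemo, Utf8, Utf16) are made up for illustration; only the System.Text.Encoding calls come from the framework.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
static class EncodingDemo
{
    // UTF-8 bytes of the given string.
    public static byte[] Utf8(string s)
    {
        return Encoding.UTF8.GetBytes(s);
    }

    // UTF-16 little-endian bytes of the given string (GetBytes emits no BOM).
    public static byte[] Utf16(string s)
    {
        return Encoding.Unicode.GetBytes(s);
    }

    static void Main()
    {
        // Build the string from the code point, so no escape sequences are involved.
        string s = char.ConvertFromUtf32(0x1D7D9);

        Console.WriteLine(BitConverter.ToString(Utf8(s)));   // F0-9D-9F-99 (one 4-byte sequence)
        Console.WriteLine(BitConverter.ToString(Utf16(s)));  // 35-D8-D9-DF (surrogates D835 DFD9)
    }
}
```

Both encodings need four bytes for this code point: UTF-8 as a single four-byte sequence, UTF-16 as two 16-bit surrogates.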

The reason people sometimes refer to .NET as UCS-2 is, I think (I see few other reasons), that Char is strictly 16 bits wide, so a single Char cannot represent a character from the upper planes. Char does, however, have static method overloads (e.g. Char.IsLetter) that take a string and an index and can therefore operate on upper-plane characters stored as surrogate pairs inside a string. Strings are stored as true UTF-16.
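The difference between the single-Char overloads and the string-plus-index overloads can be sketched as follows. CharOverloadDemo, Sample and CategoryAt are hypothetical names; what the classification calls report for the full pair depends on the Unicode tables of the platform you run on, so those lines carry no expected output.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class, for illustration only.
static class CharOverloadDemo
{
    // U+1D7D9 MATHEMATICAL DOUBLE-STRUCK DIGIT ONE, stored as a surrogate pair.
    public static readonly string Sample = char.ConvertFromUtf32(0x1D7D9);

    // Classifies the code point that starts at index, following a surrogate pair if present.
    public static UnicodeCategory CategoryAt(string s, int index)
    {
        return CharUnicodeInfo.GetUnicodeCategory(s, index);
    }

    static void Main()
    {
        string s = Sample;

        Console.WriteLine(s.Length);                    // 2: two UTF-16 code units
        Console.WriteLine(char.IsSurrogatePair(s, 0));  // True
        Console.WriteLine(char.IsDigit(s[0]));          // False: a lone high surrogate is not a digit

        // The string + index overloads look at the whole surrogate pair.
        // Their results depend on the platform's Unicode character data.
        Console.WriteLine(char.IsDigit(s, 0));
        Console.WriteLine(CategoryAt(s, 0));
    }
}
```

Char.IsLetter, Char.IsDigit and the other classification methods all come in this pair of shapes: one overload taking a char, one taking a string and an index.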

You can address high Unicode code points directly using the uppercase \U escape, e.g. "\U0001D7D9", but again only inside strings, not chars.
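A short sketch of why the literal from the question ends up as ᵽ9 and how the \U form differs. EscapeDemo and its constant names are invented for this example.

```csharp
using System;

// Hypothetical demo class, for illustration only.
static class EscapeDemo
{
    // \u consumes exactly four hex digits, so this is U+1D7D followed by the digit '9'.
    public const string FourDigit = "\u1D7D9";

    // \U consumes exactly eight hex digits and can name code points above U+FFFF.
    public const string EightDigit = "\U0001D7D9";

    static void Main()
    {
        Console.WriteLine(FourDigit[0] == '\u1D7D' && FourDigit[1] == '9');           // True
        Console.WriteLine("{0:X4} {1:X4}", (int)EightDigit[0], (int)EightDigit[1]);   // D835 DFD9
        Console.WriteLine("U+{0:X}", char.ConvertToUtf32(EightDigit, 0));             // U+1D7D9
        Console.WriteLine(EightDigit == char.ConvertFromUtf32(0x1D7D9));              // True

        // char c = '\U0001D7D9';  // does not compile: a char holds a single UTF-16 code unit
    }
}
```

Both strings have Length 2, but for different reasons: the first holds two separate characters, the second holds one character encoded as a surrogate pair.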

As for the Unicode version, from the MSDN documentation:

"In the .NET Framework 4, sorting, casing, normalization, and Unicode character information is synchronized with Windows 7 and conforms to the Unicode 5.1 standard."

Update 1: It's worth noting, however, that this does not imply that the entirety of Unicode 5.1 is supported, neither in Windows 7 nor in .NET 4.0.

Windows 8 targets Unicode 6.0 (http://social.msdn.microsoft.com/Forums/en-US/winappswithcsharp/thread/05cd7b61-493d-4384-b709-b319e007332b/). I'm guessing that .NET Framework 4.5 might synchronize with that, but I have found no sources confirming it. Once again, that doesn't mean the entire standard is implemented.

Update 2: This note on Roslyn confirms that the underlying platform defines the Unicode support for the compiler, and the linked code (https://github.com/ufcpp/UfcppSample/blob/master/BreakingChanges/VS2015_CS6/KatakanaMiddleDot.cs) explains that C# 6.0 supports Unicode 6.0 and up (with a breaking change for C# identifiers as a result).

Update 3: Since .NET 4.5, a new class SortVersion lets you find out the supported Unicode version by reading the SortVersion.FullVersion property (https://msdn.microsoft.com/en-us/library/system.globalization.sortversion.fullversion%28v=vs.110%29.aspx). On the same page, Microsoft explains that .NET 4.0 supports Unicode 5.0 on all platforms, and that .NET 4.5 supports Unicode 5.0 on Windows 7 and Unicode 6.0 on Windows 8. This contrasts slightly with the official "what is new" statement, which talks of versions 5.x and 6.0 respectively. From my own (editor: Abel) experience, in most cases it seems that .NET 4.0 supports Unicode 5.1, at least for character classes, but I didn't test sorting, normalization and collation. This seems in line with the MSDN statement quoted above.
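As a rough sketch of how to query this at run time: FullVersion is an instance property, and a SortVersion object can be obtained from CompareInfo.Version (.NET 4.5 or later). SortVersionDemo and Current are hypothetical names, and the printed value varies by platform and OS, so no expected output is shown.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class, for illustration only.
static class SortVersionDemo
{
    // Returns the sort version used by the invariant culture's comparer.
    public static SortVersion Current()
    {
        return CultureInfo.InvariantCulture.CompareInfo.Version;
    }

    static void Main()
    {
        SortVersion version = Current();

        // FullVersion is an int; see the SortVersion documentation for how to interpret it.
        Console.WriteLine("FullVersion: 0x{0:X8}", version.FullVersion);
        Console.WriteLine("SortId:      {0}", version.SortId);
    }
}
```

Comparing two SortVersion values (the class implements IEquatable<SortVersion>) is one way to detect that sort data has changed between runs, which matters if you persist sorted or indexed strings.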
