What version of Unicode is supported by which .NET platform, and on which version of Windows, with regard to character classes?
Question
Updated question¹
With regard to character classes, comparison, sorting, normalization and collations, what Unicode version or versions are supported by which .NET platforms?
Original question
I remember, somewhat vaguely, having read that .NET supported Unicode version 3.0 and that the internal UTF-16 encoding is not really UTF-16 but actually UCS-2, which is not the same. It seems, for instance, that characters above U+FFFF are not possible, i.e. consider:
string s = "\u1D7D9"; // ("Mathematical double-struck digit one")
and it stores the string "ᵽ9".
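I'm basically looking for definitive references for answers to the following:
- If it is not true UTF-16 in .NET, what is it?
- What version of Unicode is supported by .NET?
- If the latest version is not supported, or not planned for the near future, does anyone know of a (non-)commercial library, or of a way I can work around this issue?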
¹) I updated the question because, with passing time, it seems more appropriate with respect to the answers and to the larger community. I left the original question in place; parts of it have been answered in the comments. Also, while the old UCS-2 (no surrogates) was used in now-ancient 32-bit Windows versions, .NET has always used UTF-16 (with surrogates) internally.
Recommended answer
Internally, .NET is UTF-16. In some cases, e.g. when ASP.NET writes to a response, it uses UTF-8 by default. Both encodings can handle the higher planes.
The reason people sometimes refer to .NET as UCS-2 is (I think, because I see few other reasons) that Char is strictly 16-bit, so a single Char can't represent a code point in the upper planes. Char does, however, have static method overloads (e.g. Char.IsLetter) that take a string and an index and can operate on high-plane UTF-16 characters inside that string. Strings are stored as true UTF-16.
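As a rough sketch of what working with those string-based overloads looks like, the program below walks a string code point by code point and compares the (string, index) overloads with the single-Char ones. The class name CodePointDemo and the helper CodePoints are made up for illustration; char.ConvertToUtf32, char.IsSurrogatePair, char.IsDigit and CharUnicodeInfo.GetUnicodeCategory are the standard library members. It assumes a compiler and runtime recent enough for the \U escape and these overloads, which should cover any .NET version discussed here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper class, for illustration only.
static class CodePointDemo
{
    // Yields each Unicode code point in a UTF-16 string, counting a valid
    // surrogate pair as one code point. char.ConvertToUtf32 throws an
    // ArgumentException if it meets an unpaired surrogate.
    public static IEnumerable<int> CodePoints(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            yield return char.ConvertToUtf32(s, i);
            if (char.IsSurrogatePair(s, i))
            {
                i++; // skip the low surrogate that was just consumed
            }
        }
    }

    static void Main()
    {
        // 'x' followed by MATHEMATICAL DOUBLE-STRUCK DIGIT ONE (U+1D7D9)
        string s = "x\U0001D7D9";

        Console.WriteLine("UTF-16 code units: " + s.Length);
        foreach (int cp in CodePoints(s))
        {
            Console.WriteLine("U+" + cp.ToString("X4"));
        }

        // The (string, int) overloads consider the whole surrogate pair at the index...
        Console.WriteLine(char.IsDigit(s, 1));
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory(s, 1));

        // ...whereas the single-char overload only ever sees one half of the pair.
        Console.WriteLine(char.IsDigit(s[1]));
    }
}
```

What the classification calls report for a given character depends on the Unicode data shipped with the runtime and platform, which is exactly the version question discussed here.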
You can address high Unicode code points directly using the uppercase \U escape, e.g. "\U0001D7D9", but again, only inside strings, not chars. \U takes exactly eight hex digits, whereas \u takes exactly four, which is why "\u1D7D9" is read as U+1D7D followed by the character 9.
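To make the difference between the two escapes concrete, here is a minimal sketch that dumps the UTF-16 code units of both literals. EscapeDemo and Describe are hypothetical names introduced only for this example; char.ConvertFromUtf32 is the standard method for building a string from a code point at run time.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
static class EscapeDemo
{
    // Formats every UTF-16 code unit of the string as four hex digits.
    public static string Describe(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            sb.AppendFormat("{0:X4} ", (int)c);
        }
        return sb.ToString().TrimEnd();
    }

    static void Main()
    {
        string fourDigit = "\u1D7D9";     // \u consumes 4 hex digits: U+1D7D, then '9'
        string eightDigit = "\U0001D7D9"; // \U consumes 8 hex digits: U+1D7D9

        Console.WriteLine(Describe(fourDigit));  // 1D7D 0039
        Console.WriteLine(Describe(eightDigit)); // D835 DFD9 (a surrogate pair)

        // The same string can be built from the code point at run time.
        string built = char.ConvertFromUtf32(0x1D7D9);
        Console.WriteLine(built == eightDigit);  // True
    }
}
```

Both strings have a Length of 2, so the length alone does not reveal the mistake; only the code unit values do.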
As for the Unicode version, from the MSDN documentation:
"In the .NET Framework 4, sorting, casing, normalization, and Unicode character information is synchronized with Windows 7 and conforms to the Unicode 5.1 standard."
Update 1: It's worth noting, however, that this does not imply that the entirety of Unicode 5.1 is supported, neither in Windows 7 nor in .NET 4.0.
Windows 8 targets Unicode 6.0 (http://social.msdn.microsoft.com/Forums/en-US/winappswithcsharp/thread/05cd7b61-493d-4384-b709-b319e007332b/). I'm guessing that .NET Framework 4.5 might synchronize with that, but I have found no sources confirming it. And once again, that doesn't mean the entire standard is implemented.
Update 2: This note on Roslyn confirms that the underlying platform defines the Unicode support for the compiler, and the linked code (https://github.com/ufcpp/UfcppSample/blob/master/BreakingChanges/VS2015_CS6/KatakanaMiddleDot.cs) explains that C# 6.0 supports Unicode 6.0 and up (with a breaking change for C# identifiers as a result).
Update 3: Since .NET version 4.5, a new class, SortVersion, is available to identify the supported Unicode version: read the SortVersion.FullVersion property (https://msdn.microsoft.com/en-us/library/system.globalization.sortversion.fullversion%28v=vs.110%29.aspx) of the SortVersion instance that CompareInfo.Version returns. On the same page, Microsoft explains that .NET 4.0 supports Unicode 5.0 on all platforms, and that .NET 4.5 supports Unicode 5.0 on Windows 7 and Unicode 6.0 on Windows 8. This contrasts slightly with the official "what's new" statement, which speaks of versions 5.x and 6.0 respectively. From my own (editor: Abel) experience, in most cases it seems that in .NET 4.0, Unicode 5.1 is supported at least for character classes, but I didn't test sorting, normalization and collations. This seems in line with what is said in MSDN as quoted above.
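A minimal sketch of reading that version information follows. SortVersionDemo and Describe are hypothetical names; CompareInfo.Version, SortVersion.FullVersion and SortVersion.SortId are the documented members. Note that FullVersion is an opaque version number for the sorting data rather than a literal Unicode version, so mapping it to a Unicode release still relies on Microsoft's documentation, and the values printed will differ between Windows versions and runtimes.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, for illustration only.
static class SortVersionDemo
{
    // Returns the sort version details for the given culture's CompareInfo.
    public static string Describe(CultureInfo culture)
    {
        SortVersion version = culture.CompareInfo.Version;
        string name = culture.Name.Length == 0 ? "(invariant)" : culture.Name;
        return string.Format(
            "{0}: FullVersion={1}, SortId={2}",
            name, version.FullVersion, version.SortId);
    }

    static void Main()
    {
        Console.WriteLine(Describe(CultureInfo.InvariantCulture));
        Console.WriteLine(Describe(CultureInfo.CurrentCulture));

        // Two SortVersion values compare equal when strings sorted under
        // one would sort identically under the other.
        SortVersion a = CultureInfo.InvariantCulture.CompareInfo.Version;
        SortVersion b = CultureInfo.CurrentCulture.CompareInfo.Version;
        Console.WriteLine("Same sort behaviour: " + a.Equals(b));
    }
}
```

One plausible use is to store the FullVersion alongside persisted, sorted data and re-sort when the value changes after a platform upgrade.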