Understand Unicode and code pages in SQL Server collations
Question
Why are all SQL Server 2008 R2 collations associated with a code page? Are all collations Unicode?
How should we choose a collation when our database is used with several languages that use different code pages?
Thanks.
Recommended answer
CHAR vs. NCHAR (i.e. non-Unicode vs. Unicode) defines the character storage encoding. Collations define... collation (i.e. sort order and comparison rules). They are different concepts, although they are often confused.
The confusion stems from the fact that client tools use the collation of non-Unicode data as a hint to choose the code page of the data. See Code Page Architecture. This means that a client like ADO.NET SqlClient can properly convert the single-byte CHAR data received from the server into a .NET string object, which is Unicode (UTF-16) internally. The column metadata contains the collation used, so the client knows how to interpret the single-byte data according to a specific code page.
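As a rough illustration of why the client needs that hint, the sketch below decodes the same single byte under two Windows code pages. It uses Python's built-in codecs rather than SqlClient, and it assumes the usual pairing of Latin1_General collations with code page 1252 and Cyrillic_General collations with code page 1251.

```python
# Illustration only: one stored byte, two possible meanings.
# Assumption: code page 1252 stands in for a Latin1_General column,
# code page 1251 for a Cyrillic_General column.
raw = bytes([0xE9])  # a single CHAR byte as it might arrive from the server

as_western = raw.decode("cp1252")   # Western European interpretation
as_cyrillic = raw.decode("cp1251")  # Cyrillic interpretation

print(as_western)   # é
print(as_cyrillic)  # й
```

Without the collation in the column metadata, the client would have no way to tell which of the two characters was meant.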
For Unicode (NCHAR) columns the client does not need to interpret the data according to a code page: the data itself is already multi-byte, and the client interprets it according to the UCS-2 encoding (the actual flavor of Unicode used by SQL Server).
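A minimal sketch of the Unicode case, again in Python rather than in a real client library: the characters below all sit in the Basic Multilingual Plane, so each one takes exactly two bytes, and decoding needs no code page. Python's "utf-16-le" codec is used here as a stand-in for UCS-2, which is an assumption that holds only for such two-byte characters.

```python
# Illustration only: Unicode (NCHAR-style) data carries no code page dependency.
# Assumption: "utf-16-le" approximates UCS-2 for these BMP characters.
text = "\u00e9 \u0439"  # 'é', a space, and 'й' in one string

wire = text.encode("utf-16-le")     # two bytes per character
decoded = wire.decode("utf-16-le")  # no collation or code page hint needed

print(len(wire))        # 6
print(decoded == text)  # True
```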
However, do not confuse this with what collations actually are: rules for comparing characters. As described in Working with Collations:
an English speaker would expect the character string 'Chiapas' to come before 'Colima' in ascending order. However, a Spanish speaker in Mexico might expect words beginning with 'Ch' to appear at the end of a list of words starting with 'C'. Collations dictate these kinds of sorting and comparison rules. The Latin1_General collation will sort 'Chiapas' before 'Colima' in an ORDER BY ASC clause, whereas the Traditional_Spanish collation will sort 'Chiapas' after 'Colima'.
These sorting rules apply regardless of data type, whether non-Unicode (CHAR) or Unicode (NCHAR).
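The Chiapas/Colima example can be imitated outside SQL Server with two sort keys. This is a simplified sketch, not how SQL Server implements collations: the function traditional_spanish_key is hypothetical and only models the single rule that 'ch' sorts as one letter after 'c'.

```python
def traditional_spanish_key(word):
    """Hypothetical, simplified key: 'ch' counts as one letter after 'c'."""
    s = word.lower()
    key = []
    i = 0
    while i < len(s):
        if s.startswith("ch", i):
            key.append(("c", 1))  # ranks after every plain 'c', before 'd'
            i += 2
        else:
            key.append((s[i], 0))
            i += 1
    return key


names = ["Colima", "Chiapas", "Campeche", "Durango"]

# Plain case-insensitive ordering, roughly what an English speaker expects.
english_order = sorted(names, key=str.lower)

# Ordering under the simplified traditional Spanish rule.
spanish_order = sorted(names, key=traditional_spanish_key)

print(english_order)  # ['Campeche', 'Chiapas', 'Colima', 'Durango']
print(spanish_order)  # ['Campeche', 'Colima', 'Chiapas', 'Durango']
```

The strings are identical in both runs and only the comparison rule changes, which is the sense in which a collation is independent of how the characters are stored.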