Understand Unicode and code pages in SQL Server collations

Views: 81. Published: 2020/10/5 5:38:37. Tags: sql-server sql-server-2008-r2 collation

Question

Why are all SQL Server 2008 R2 collations associated with a code page? Are all collations Unicode?

How should we choose a collation when our database is used with several languages that use different code pages?

Thanks.

Recommended answer

CHAR vs. NCHAR (i.e. non-Unicode vs. Unicode) defines the character storage encoding. Collations define, well, collation (i.e. sort order and comparison rules). They are different concepts, although they are often confused.

The confusion stems from the fact that client tools use the collation of non-Unicode data as a hint to choose the code page of the data. See Code Page Architecture. This means that a client like ADO.NET SqlClient can properly convert the single-byte CHAR data received from the server into a multi-byte .NET string object. The column metadata contains the collation used, so the client knows how to interpret the single-byte data according to a specific code page.
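To make the code page hint concrete, here is a minimal Python sketch of what a client does conceptually when it receives bytes from a CHAR column. The lookup table and the helper name decode_char_column are hypothetical and exist only for illustration; a real driver takes the code page from the collation in the column metadata rather than from a hard-coded dictionary.

```python
# Illustrative mapping from a collation name to the Windows code page
# used for its non-Unicode (CHAR/VARCHAR) data. Hypothetical table:
# a real client derives this from the column metadata.
COLLATION_CODE_PAGE = {
    "Latin1_General_CI_AS": "cp1252",
    "Cyrillic_General_CI_AS": "cp1251",
    "Greek_CI_AS": "cp1253",
}


def decode_char_column(raw: bytes, collation: str) -> str:
    """Decode bytes from a CHAR column using the collation's code page."""
    return raw.decode(COLLATION_CODE_PAGE[collation])


# The same single byte means a different character under each code page.
raw = b"\xe9"
for name in COLLATION_CODE_PAGE:
    print(name, repr(decode_char_column(raw, name)))
```

The byte on the wire is identical in all three cases; only the collation tells the client which character it stands for.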

For Unicode (NCHAR) columns, the client does not need to interpret the data according to a code page: the data itself is already multi-byte, and the client interprets it according to the UCS-2 encoding (the actual flavor of Unicode used by SQL Server).
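By contrast, a rough Python sketch of the Unicode case needs no collation argument at all. Python has no dedicated UCS-2 codec, so this sketch assumes utf-16-le, which produces the same two bytes per character as UCS-2 for characters in the Basic Multilingual Plane. The helper name decode_nchar_column is hypothetical.

```python
def decode_nchar_column(raw: bytes) -> str:
    """Decode bytes from an NCHAR column; no code page is involved."""
    # Assumption: utf-16-le stands in for UCS-2 (identical for BMP characters).
    return raw.decode("utf-16-le")


# Latin, Cyrillic and Greek characters in one value, two bytes each.
text = "\u00e9\u0439\u03b9"
raw = text.encode("utf-16-le")
print(len(raw), repr(decode_nchar_column(raw)))
```

Characters that would require three different code pages in CHAR columns round-trip here through a single encoding.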

However, do not confuse this with what collations actually are: rules for comparing characters. As described in Working with Collations:

An English speaker would expect the character string 'Chiapas' to come before 'Colima' in ascending order. However, a Spanish speaker in Mexico might expect words beginning with 'Ch' to appear at the end of a list of words starting with 'C'. Collations dictate these kinds of sorting and comparison rules. The Latin1_General collation will sort 'Chiapas' before 'Colima' in an ORDER BY ASC clause, whereas the Traditional_Spanish collation will sort 'Chiapas' after 'Colima'.

This sorting rule applies to any data type (CHAR non-Unicode or NCHAR Unicode).
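The Chiapas/Colima ordering from the quoted passage can be imitated with a toy Python sort key. This is only a sketch of the idea, not how SQL Server implements collations: the function traditional_spanish_key is hypothetical, it handles only the 'ch' digraph, and it ignores accents, 'ñ' and other rules a real collation covers.

```python
def traditional_spanish_key(word: str) -> str:
    """Toy sort key that places the digraph 'ch' after every other 'c' word."""
    # Replace 'ch' with 'c' plus a very high code point, so that 'ch...'
    # sorts after 'cz...' but still before 'd...'.
    return word.lower().replace("ch", "c\uffff")


states = ["Colima", "Durango", "Chiapas", "Campeche"]

# Plain code point order, roughly what an English speaker expects here.
print(sorted(states))

# Same strings, different comparison rule: 'Chiapas' moves after 'Colima'.
print(sorted(states, key=traditional_spanish_key))
```

The strings and their encoding are unchanged between the two calls; only the comparison rule differs, which is the role a collation plays for both CHAR and NCHAR data.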
