Why does SQL_Latin1_General_CP1_CI_AS sort number-sign before underscore?
Following up on https://stackoverflow.com/a/32233795/14731, I was surprised to discover that:
DECLARE @SampleData TABLE (ANSI VARCHAR(50), UTF16 NVARCHAR(50));

INSERT INTO @SampleData (ANSI, UTF16) VALUES
    ('##MS_PolicyTsqlExecutionLogin##', N'##MS_PolicyTsqlExecutionLogin##'),
    ('_gaia', N'_gaia');

SELECT sd.ANSI AS [ANSI-SQL_Latin1_General_CP1_CI_AS]
FROM @SampleData sd
ORDER BY sd.ANSI COLLATE SQL_Latin1_General_CP1_CI_AS ASC;

SELECT sd.UTF16 AS [UTF16-SQL_Latin1_General_CP1_CI_AS]
FROM @SampleData sd
ORDER BY sd.UTF16 COLLATE SQL_Latin1_General_CP1_CI_AS ASC;
Results in:

ANSI-SQL_Latin1_General_CP1_CI_AS
-------------------------------------
##MS_PolicyTsqlExecutionLogin##
_gaia

UTF16-SQL_Latin1_General_CP1_CI_AS
-------------------------------------
##MS_PolicyTsqlExecutionLogin##
_gaia
However, according to "Why doesn't ICU4J match UTF-8 sort order?", the Unicode results are supposed to be in the opposite order, with the underscore sorting before the number sign. Why is this the case?
It turns out that @一二三 was right that SQL Server does not implement the default Unicode Collation Algorithm rules, but wrong that it uses a code page for Unicode sorting. https://stackoverflow.com/a/32706510/14731 contains a detailed explanation of how Unicode sorting is actually implemented.
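
For anyone who wants to poke at this further, it may help to re-sort the same sample under other collations and compare the output with the results above. The following is only a minimal sketch: the two collations used here (Latin1_General_BIN2 and Latin1_General_100_CI_AS) are my own picks for illustration, not something taken from the linked answers, and it assumes both are available on your instance.

```sql
DECLARE @SampleData TABLE (UTF16 NVARCHAR(50));

INSERT INTO @SampleData (UTF16) VALUES
    (N'##MS_PolicyTsqlExecutionLogin##'),
    (N'_gaia');

-- Binary collation (illustrative choice): NVARCHAR values are compared by
-- code point, and '#' is U+0023 while '_' is U+005F, so '#' sorts first here.
SELECT sd.UTF16 AS [UTF16-Latin1_General_BIN2]
FROM @SampleData sd
ORDER BY sd.UTF16 COLLATE Latin1_General_BIN2 ASC;

-- Windows collation (illustrative choice): run it and compare the order
-- against the SQL_Latin1_General_CP1_CI_AS results shown earlier.
SELECT sd.UTF16 AS [UTF16-Latin1_General_100_CI_AS]
FROM @SampleData sd
ORDER BY sd.UTF16 COLLATE Latin1_General_100_CI_AS ASC;
```

Note that a matching order under the binary collation would not by itself show that SQL Server sorts Unicode data by code point; for just two characters, the orders can coincide even when the underlying rules differ.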