How to get a case sensitive version of a collation in SQL Server?


Problem description

Is there a way to get a case-sensitive version of a collation to use in a query?

Let's say that the query could be used on databases with different collations, some of which are case-insensitive, and which can have different cultures (multiple clients, for example).

However, this query should always behave in a case-sensitive manner, while, if possible, not changing the collation culture and other properties.

For example, if a DB happens to be using SQL_Latin1_General_CP1_CI_AS (CI here stands for Case Insensitive), I would like to use SQL_Latin1_General_CP1_CS_AS (CS for Case Sensitive).

Simplistic query example:

DECLARE @Title nvarchar(2) = 'qQ'

--Case insensitive (following DB collation)
SELECT REPLACE(@Title, 'q', 'o') --Result: 'oo'

--Case sensitive, but fixed to a collation
SELECT REPLACE(@Title COLLATE SQL_Latin1_General_CP1_CS_AS, 'q', 'o') --Result: 'oQ'

Fixing a collation like this in the query could cause problems when migrating the code, or when changing the DB collation at a later date.

Is there a built-in function to get the case-sensitive version of the current collation, or a workaround that could be used for this?

Solution

Collations are not necessarily determined by the Database default value: they can be set per string field as well.
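
To illustrate the point about per-field collations, here is a minimal sketch. The table name dbo.CollationDemo and its columns are hypothetical, and it assumes a Latin1_General_100 collation is available (SQL Server 2008 or later):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.CollationDemo
(
  DefaultCol  nvarchar(50),                                  -- inherits the database default collation
  ExplicitCol nvarchar(50) COLLATE Latin1_General_100_CS_AS  -- fixed at the column level
);

-- Show which collation each column ended up with.
SELECT c.[name], c.collation_name
FROM   sys.columns c
WHERE  c.[object_id] = OBJECT_ID(N'dbo.CollationDemo');

DROP TABLE dbo.CollationDemo;
```

ExplicitCol keeps its collation even if the database default is changed later, which is why the database default alone does not tell you how a given comparison will behave.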

No, I have never seen a way (and I have looked) to do dynamic collations other than using Dynamic SQL to write the COLLATE clause into a query. Or, if the number of options you need to account for is fairly small, you could maybe try something like the following:

SELECT ...
FROM   ...
WHERE (@CaseSensitive = 1 AND [Field] LIKE N'%' + @Name + N'%' COLLATE Something_CS_AS)
OR (@CaseSensitive = 0 AND [Field] LIKE N'%' + @Name + N'%')
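
For the Dynamic SQL route, a minimal sketch might look like the following. It reuses the REPLACE example from the question; the variable names are assumptions, and the collation name is hard-coded here only for illustration. Because a collation name cannot be passed as a parameter, it is checked against sys.fn_helpcollations() before being concatenated into the statement:

```sql
DECLARE @Collation sysname = N'SQL_Latin1_General_CP1_CS_AS'; -- assumed; normally looked up at run time
DECLARE @Title nvarchar(2) = N'qQ';
DECLARE @SQL nvarchar(max);

-- Only concatenate names that are known, installed collations.
IF NOT EXISTS (SELECT 1 FROM sys.fn_helpcollations() WHERE [name] = @Collation)
BEGIN
  RAISERROR(N'Unknown collation: %s', 16, 1, @Collation);
  RETURN;
END;

-- The COLLATE clause is written into the statement text; the data is still passed as a parameter.
SET @SQL = N'SELECT REPLACE(@Title COLLATE ' + @Collation
         + N', N''q'', N''o'') AS [Result];';

EXEC sys.sp_executesql @SQL, N'@Title nvarchar(2)', @Title = @Title;
```

With a case-sensitive collation this should behave like the second query in the question and leave the upper-case "Q" untouched.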

Also, there is no direct equivalence between Case (or even Accent, Kana, or Width) sensitive and insensitive. While most of the time there is a case-sensitive counterpart to a case-insensitive collation, there are 15 collations that are case-insensitive-only:

;WITH CaseS AS
(
  SELECT [name]
  FROM   sys.fn_helpcollations()
  WHERE  [name] LIKE N'%[_]cs[_]%'
)
SELECT CaseI.*
FROM   sys.fn_helpcollations() CaseI
LEFT JOIN CaseS
       ON CaseI.name = REPLACE(CaseS.[name], N'_CS_', N'_CI_')
WHERE  CaseI.[name] LIKE N'%[_]ci[_]%'
AND    CaseS.[name] IS NULL;

Returns:

name                                  description
SQL_1xCompat_CP850_CI_AS              ...
SQL_AltDiction_CP850_CI_AI            ...
SQL_AltDiction_Pref_CP850_CI_AS       ...
SQL_Danish_Pref_CP1_CI_AS             ...
SQL_Icelandic_Pref_CP1_CI_AS          ...
SQL_Latin1_General_CP1_CI_AI          ...
SQL_Latin1_General_CP1253_CI_AI       ...
SQL_Latin1_General_CP437_CI_AI        ...
SQL_Latin1_General_CP850_CI_AI        ...
SQL_Latin1_General_Pref_CP1_CI_AS     ...
SQL_Latin1_General_Pref_CP437_CI_AS   ...
SQL_Latin1_General_Pref_CP850_CI_AS   ...
SQL_Scandinavian_Pref_CP850_CI_AS     ...
SQL_SwedishPhone_Pref_CP1_CI_AS       ...
SQL_SwedishStd_Pref_CP1_CI_AS         ...

"Fixing a collation like this in the query could cause problems when migrating the code,"

Why? Where are you planning on migrating the code to? If to another RDBMS, then you already need to contend with datatype differences, SQL dialect differences, "best practices" differences, etc. So why worry about collations? Unless you know for certain that you will be migrating to another RDBMS, you should make your system work as well as it can by using your current platform to the best of its abilities, rather than existing in a less-than-optimal state due to only using lowest-common-denominator functionality.

"or changing the DB collation at a later date."

Why would you do this? Again, any string fields with an explicit COLLATE setting are not affected by the database default.


If you are looking for strict case sensitivity (and sensitivity to everything else, including accents, etc.) on equivalence (we are not talking about range searches or sorting), then you can use a binary collation (i.e. one ending in either _BIN or _BIN2). Just keep in mind that binary collations might not sort the way you might expect since they are not "dictionary" based sorts, at least not in terms of a single binary collation that would behave the same across all languages. They also don't make equivalences between languages (e.g. equating "a" with an "a" that has an accent).

Since the original posting of this answer I have discovered that the paragraph above is actually bad advice. Please do not use a binary collation if the goal is case-sensitivity. It is too strict and in many cases will not give accurate results. For details and examples, please see: No, Binary Collations are not Case-Sensitive.

Also, please do not use binary collations ending in just _BIN as they have been obsolete since SQL Server 2005 was released and should only be used when needing to maintain backwards compatibility with another system that is also using a _BIN collation. If you need a binary collation, use one ending in _BIN2. For details and examples, please see: Differences Between the Various Binary Collations (Cultures, Versions, and BIN vs BIN2).
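
As a rough illustration of why a binary collation is stricter than "case-sensitive", the sketch below compares two ways of encoding the same visible character. The variable names are assumptions, and it assumes the Latin1_General_100 collations are available. As far as I understand, the case-sensitive collation should treat the two forms as equal, while the binary collation compares the underlying code units and should not:

```sql
DECLARE @Precomposed nvarchar(10) = NCHAR(0x00FC);         -- "u" with diaeresis as a single code point
DECLARE @Decomposed  nvarchar(10) = N'u' + NCHAR(0x0308);  -- "u" followed by a combining diaeresis

SELECT
  CASE WHEN @Precomposed = @Decomposed COLLATE Latin1_General_100_CS_AS
       THEN 'equal' ELSE 'different' END AS [CaseSensitiveCollation],
  CASE WHEN @Precomposed = @Decomposed COLLATE Latin1_General_100_BIN2
       THEN 'equal' ELSE 'different' END AS [BinaryCollation];
```

If the two columns differ on your instance, that is the "too strict" behavior described above: the binary collation rejects matches that a linguistic, case-sensitive collation accepts.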


UPDATE

I was able to come up with a function to get the case sensitive version, if one exists, of the passed-in collation. This function, however, will only assist in creating the correct Dynamic SQL; it cannot be used inline in a query to set the COLLATE clause dynamically (mainly because that cannot be done). There are two parameters:

  • @CollationName -- if you pass this in, you will get back the case-sensitive version of it, if one exists. The @DatabaseName param will be ignored.
  • @DatabaseName -- if you don't know the exact collation, leave @CollationName as NULL and pass this in and it will look up the default collation for that database.
  • If both params are NULL then it will look up the default collation for the database that the function exists in.
  • If the passed-in or looked-up collation is already case-sensitive, that name will be returned.
  • TO DO (when I have time): look up server default collation for databases that do not have a default (they will have NULL as their default collation name)

There are two versions of the function: the first is a TVF (as those are faster) and the second is a Scalar UDF (as those are sometimes easier to interact with).

Table-Valued Function:

USE [Test];
SET ANSI_NULLS ON;

IF (OBJECT_ID(N'dbo.GetCaseSensitiveCollation') IS NOT NULL)
BEGIN
  DROP FUNCTION dbo.GetCaseSensitiveCollation;
END;

GO
CREATE FUNCTION dbo.GetCaseSensitiveCollation
(
  @CollationName sysname,
  @DatabaseName sysname
)
RETURNS TABLE
--WITH SCHEMABINDING
--     Cannot schema bind table valued function 'dbo.GetCaseSensitiveCollation'
--     because it references system object 'sys.fn_helpcollations'.
AS RETURN

  WITH collation(name) AS
  (
    SELECT CONVERT(sysname, COALESCE(@CollationName,
                DATABASEPROPERTYEX(COALESCE(@DatabaseName, DB_NAME()), 'Collation')))
  )
  SELECT col.[name]
  FROM   sys.fn_helpcollations() col
  CROSS JOIN collation
  WHERE  col.[name] = CASE WHEN collation.[name] LIKE N'%[_]CS[_]%' 
                               THEN collation.[name]
                           ELSE REPLACE(collation.[name], N'_CI_', N'_CS_')
                      END;
GO

Examples:

-- Get CS Collation for the specified Collation
SELECT [name] AS [BySpecificCollation]
FROM dbo.GetCaseSensitiveCollation(N'Indic_General_100_CI_AS_KS_WS', NULL);

-- Get CS Collation based on database default for the specified database
SELECT [name] AS [ByDefaultCollationForDB]
FROM dbo.GetCaseSensitiveCollation(NULL, N'msdb');

-- Get CS Collation based on database default for database that the function exists in
SELECT [name] AS [CurrentDB]
FROM Test.dbo.GetCaseSensitiveCollation(NULL, NULL);

-- Get CS Collation based on database default for the current database
USE [ReportServer];
SELECT [name] AS [CurrentDB]
FROM Test.dbo.GetCaseSensitiveCollation(NULL, DB_NAME());

Scalar User-Defined Function:

USE [Test];
SET ANSI_NULLS ON;

IF (OBJECT_ID(N'dbo.GetCaseSensitiveCollation2') IS NOT NULL)
BEGIN
  DROP FUNCTION dbo.GetCaseSensitiveCollation2;
END;
GO
CREATE FUNCTION dbo.GetCaseSensitiveCollation2
(
  @CollationName sysname,
  @DatabaseName sysname
)
RETURNS sysname
--WITH SCHEMABINDING
--     Cannot schema bind table valued function 'dbo.GetCaseSensitiveCollation2'
--     because it references system object 'sys.fn_helpcollations'.
AS
BEGIN
  DECLARE @NewCollationName sysname;

  ;WITH collation(name) AS
  (
    SELECT CONVERT(sysname, COALESCE(@CollationName,
                DATABASEPROPERTYEX(COALESCE(@DatabaseName, DB_NAME()), 'Collation')))
  )
  SELECT @NewCollationName = col.[name]
  FROM   sys.fn_helpcollations() col
  CROSS JOIN collation
  WHERE  col.[name] = CASE WHEN collation.[name] LIKE N'%[_]CS[_]%'
                                THEN collation.[name]
                           ELSE REPLACE(collation.[name], N'_CI_', N'_CS_')
                      END;

  RETURN @NewCollationName;
END;
GO

Examples:

/* Get CS Collation for the specified Collation */
SELECT dbo.GetCaseSensitiveCollation2(N'Indic_General_100_CI_AS_KS_WS', NULL)
                 AS [BySpecificCollation];
-- Indic_General_100_CS_AS_KS_WS

/* Get CS Collation based on database default for the specified database */
SELECT dbo.GetCaseSensitiveCollation2(NULL, N'msdb') AS [ByDefaultCollationForDB];
-- SQL_Latin1_General_CP1_CS_AS

/* Get CS Collation based on database default for the current database */
USE [ReportServer];
SELECT Test.dbo.GetCaseSensitiveCollation2(NULL, DB_NAME()) AS [CurrentDB];
-- Latin1_General_CS_AS_KS_WS

/* Get CS Collation based on database default for database where the function exists */
SELECT Test.dbo.GetCaseSensitiveCollation2(NULL, NULL) AS [DBthatFunctionExistsIn];
-- SQL_Latin1_General_CP1_CS_AS
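
Since the function can only help build Dynamic SQL, a usage sketch along those lines might be the following. It assumes the scalar function above has been created in the [Test] database, and the variable names are hypothetical. The function returns NULL when no case-sensitive counterpart exists (as with the 15 case-insensitive-only collations), so that case is checked first:

```sql
DECLARE @Collation sysname = Test.dbo.GetCaseSensitiveCollation2(NULL, DB_NAME());
DECLARE @Title nvarchar(2) = N'qQ';
DECLARE @SQL nvarchar(max);

-- NULL means no matching case-sensitive collation was found.
IF (@Collation IS NULL)
BEGIN
  RAISERROR(N'No case-sensitive counterpart found for the current database collation.', 16, 1);
  RETURN;
END;

-- The name comes from sys.fn_helpcollations(), so it is a known collation name.
SET @SQL = N'SELECT REPLACE(@Title COLLATE ' + @Collation
         + N', N''q'', N''o'') AS [Result];';

EXEC sys.sp_executesql @SQL, N'@Title nvarchar(2)', @Title = @Title;
```

This keeps the culture and other properties of the current database's collation and changes only the case sensitivity, which is what the question asked for.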
