唯一索引,varchar列和(空白)空间的行为 [英] Behavior of unique index, varchar column and (blank) spaces

查看:165
本文介绍了唯一索引,varchar列和(空白)空间的行为的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用的是Microsoft SQL Server 2008 R2(带有最新的Service Pack/补丁),数据库排序规则为SQL_Latin1_General_CP1_CI_AS.

以下代码:

SET ANSI_PADDING ON;
GO

CREATE TABLE Test (
   Code VARCHAR(16) NULL
);
CREATE UNIQUE INDEX UniqueIndex
    ON Test(Code);

INSERT INTO Test VALUES ('sample');
INSERT INTO Test VALUES ('sample ');

SELECT '>' + Code + '<' FROM Test WHERE Code = 'sample        ';
GO

产生以下结果:

(受影响的1行)

消息2601,第14层,状态1,第8行

无法在具有唯一索引'UniqueIndex'的对象'dbo.Test'中插入重复的键行.重复的键值为(sample).

该语句已终止.

--------------------------------

>样本<

(受影响的1行)

我的问题是:

  1. 我认为索引不能存储尾随空格.谁能指出我指定/定义此行为的官方文档?
  2. 是否存在更改此行为的设置,即,使其将"sample"和"sample"识别为两个不同的值(顺便说一下,它们是两个值),所以两者都可以在索引中.
  3. 为什么SELECT返回地球? SQL Server必须使用WHERE子句中的空格来做一些有趣的事情,因为如果删除索引中的唯一性,则两个INSERT都将运行OK,而SELECT将返回两行!

任何在正确方向上的帮助/指针将不胜感激.谢谢.

解决方案

说明了空白的方法:

SQL Server遵循ANSI/ISO SQL-92规范(第8.2节, ,关于如何比较字符串的通用规则#3) 与空格. ANSI标准要求字符填充 比较中使用的字符串,以便它们的长度匹配之前 比较它们.填充直接影响WHERE的语义 和HAVING子句谓词以及其他Transact-SQL字符串 比较.例如,Transact-SQL认为字符串'abc'和 对于大多数比较操作,"abc"是等效的.

此规则的唯一例外是LIKE谓词.何时正确 LIKE谓词表达式的一侧具有带尾随的值 空间,SQL Server不会将两个值填充到相同的长度 在进行比较之前.因为LIKE的目的 根据定义,谓词是为了促进模式搜索,而不是 比简单的字符串相等测试,这不违反本节 前面提到的ANSI SQL-92规范.

这是上述所有情况的一个众所周知的例子:

DECLARE @a VARCHAR(10)
DECLARE @b varchar(10)

SET @a = '1'
SET @b = '1 ' --with trailing blank

SELECT 1
WHERE 
    @a = @b 
AND @a NOT LIKE @b
AND @b LIKE @a

以下是有关尾随空格和LIKE子句.

关于索引:

如果您提供的值与现有值之间存在差异,则插入到其值必须唯一的列中将失败 仅尾随空格.以下字符串都将被考虑 等价于唯一约束,主键或唯一索引. 同样,如果您现有一个包含以下数据的表,并尝试执行以下操作: 添加一个唯一的限制,它将失败,因为这些值是 被认为是相同的.

PaddedColumn
------------
'abc'
'abc '
'abc  '
'abc    '

(取自此处.)

I'm using Microsoft SQL Server 2008 R2 (with latest service pack/patches) and the database collation is SQL_Latin1_General_CP1_CI_AS.

The following code:

SET ANSI_PADDING ON;
GO

CREATE TABLE Test (
   Code VARCHAR(16) NULL
);
CREATE UNIQUE INDEX UniqueIndex
    ON Test(Code);

INSERT INTO Test VALUES ('sample');
INSERT INTO Test VALUES ('sample ');

SELECT '>' + Code + '<' FROM Test WHERE Code = 'sample        ';
GO

produces the following results:

(1 row(s) affected)

Msg 2601, Level 14, State 1, Line 8

Cannot insert duplicate key row in object 'dbo.Test' with unique index 'UniqueIndex'. The duplicate key value is (sample ).

The statement has been terminated.

‐‐‐‐‐‐‐‐‐‐‐‐

>sample<

(1 row(s) affected)

My questions are:

  1. I assume the index cannot store trailing spaces. Can anyone point me to official documentation that specifies/defines this behavior?
  2. Is there a setting to change this behavior, that is, make it recognize 'sample' and 'sample ' as two different values (which they are, by the way) so both can be in the index.
  3. Why on Earth is the SELECT returning a row? SQL Server must be doing something really funny/clever with the spaces in the WHERE clause because if I remove the uniqueness in the index, both INSERTs will run OK and the SELECT will return two rows!

Any help/pointer in the right direction would be appreciated. Thanks.

解决方案

Trailing blanks explained:

SQL Server follows the ANSI/ISO SQL-92 specification (Section 8.2, , General rules #3) on how to compare strings with spaces. The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them. The padding directly affects the semantics of WHERE and HAVING clause predicates and other Transact-SQL string comparisons. For example, Transact-SQL considers the strings 'abc' and 'abc ' to be equivalent for most comparison operations.

The only exception to this rule is the LIKE predicate. When the right side of a LIKE predicate expression features a value with a trailing space, SQL Server does not pad the two values to the same length before the comparison occurs. Because the purpose of the LIKE predicate, by definition, is to facilitate pattern searches rather than simple string equality tests, this does not violate the section of the ANSI SQL-92 specification mentioned earlier.

Here's a well known example of all the cases mentioned above:

DECLARE @a VARCHAR(10)
DECLARE @b varchar(10)

SET @a = '1'
SET @b = '1 ' --with trailing blank

SELECT 1
WHERE 
    @a = @b 
AND @a NOT LIKE @b
AND @b LIKE @a

Here's some more detail about trailing blanks and the LIKE clause.

Regarding indexes:

An insertion into a column whose values must be unique will fail if you supply a value that is differentiated from existing values by trailing spaces only. The following strings will all be considered equivalent by a unique constraint, primary key, or unique index. Likewise, if you have an existing table with the data below and try to add a unique restriction, it will fail because the values are considered identical.

PaddedColumn
------------
'abc'
'abc '
'abc  '
'abc    '

(Taken from here.)

这篇关于唯一索引,varchar列和(空白)空间的行为的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆