nvarchar concatenation / index / nvarchar(max) inexplicable behavior


Problem description

Today I ran into a really weird problem in SQL Server (both 2008R2 and 2012). I'm trying to build up a string using concatenation in combination with a select statement.

I have found workarounds, but I would really like to understand what's going on here and why it doesn't give me my expected result. Can someone explain it to me?

http://sqlfiddle.com/#!6/7438a/1

As requested, here is the code as well:

-- base table
create table bla (
    [id] int identity(1,1) primary key,
    [priority] int,
    [msg] nvarchar(max),
    [autofix] bit
)

-- table without primary key on id column
create table bla2 (
    [id] int identity(1,1),
    [priority] int,
    [msg] nvarchar(max),
    [autofix] bit
)

-- table with nvarchar(1000) instead of max
create table bla3 (
    [id] int identity(1,1) primary key,
    [priority] int,
    [msg] nvarchar(1000),
    [autofix] bit
)

-- fill the three tables with the same values
insert into bla ([priority], [msg], [autofix])
values (1, 'A', 0),
       (2, 'B', 0)

insert into bla2 ([priority], [msg], [autofix])
values (1, 'A', 0),
       (2, 'B', 0)

insert into bla3 ([priority], [msg], [autofix])
values (1, 'A', 0),
       (2, 'B', 0)
;
declare @a nvarchar(max) = ''
declare @b nvarchar(max) = ''
declare @c nvarchar(max) = ''
declare @d nvarchar(max) = ''
declare @e nvarchar(max) = ''
declare @f nvarchar(max) = ''

-- I expect this to work and generate 'AB', but it doesn't
select @a = @a + [msg]
    from bla
    where   autofix = 0
    order by [priority] asc

-- this DOES work: convert to nvarchar(4000)
select @b = @b + convert(nvarchar(4000),[msg])
    from bla
    where   autofix = 0
    order by [priority] asc

-- this DOES work: without WHERE clause
select @c = @c + [msg]
    from bla
    --where autofix = 0
    order by [priority] asc

-- this DOES work: without the order by
select @d = @d + [msg]
    from bla
    where   autofix = 0
    --order by [priority] asc

-- this DOES work: from bla2, so without the primary key on id
select @e = @e + [msg]
    from bla2
    where   autofix = 0
    order by [priority] asc

-- this DOES work: from bla3, so with msg nvarchar(1000) instead of nvarchar(max)
select @f = @f + [msg]
    from bla3
    where   autofix = 0
    order by [priority] asc

select @a as a, @b as b, @c as c, @d as d, @e as e, @f as f

Recommended answer

The KB article already linked by VanDerNorth does include the line

The correct behavior for an aggregate concatenation query is undefined.

but then goes on to muddy the waters a bit by providing a workaround that does seem to indicate deterministic behavior is possible.

In order to achieve the expected results from an aggregate concatenation query, apply any Transact-SQL function or expression to the columns in the SELECT list rather than in the ORDER BY clause.

Your problematic query does not apply any expressions to columns in the ORDER BY clause.

The 2005 article Ordering guarantees in SQL Server... does state

For backwards compatibility reasons, SQL Server provides support for assignments of type SELECT @p = @p + 1 ... ORDER BY at the top-most scope.

In the plans where the concatenation works as you expected, the compute scalar with the expression [Expr1003] = Scalar Operator([@x]+[Expr1004]) appears above the sort.

In the plan where it fails to work, the compute scalar appears below the sort. As explained in this Connect item from 2006, when the expression @x = @x + [msg] appears below the sort it is evaluated for each row, but all the evaluations end up using the pre-assignment value of @x. In another similar Connect item from 2006, the response from Microsoft spoke of "fixing" the issue.

The Microsoft response on all the later Connect items on this issue (and there are many) states that this is simply not guaranteed.

Example 1

we do not make any guarantees on the correctness of concatenation queries (like using variable assignments with data retrieval in a specific order). The query output can change in SQL Server 2008 depending on the plan choice, data in the tables etc. You shouldn't rely on this working consistently even though the syntax allows you to write a SELECT statement that mixes ordered rows retrieval with variable assignment.

Example 2

The behavior you are seeing is by design. Using assignment operations (concatenation in this example) in queries with ORDER BY clause has undefined behavior. This can change from release to release or even within a particular server version due to changes in the query plan. You cannot rely on this behavior even if there are workarounds. See the below KB article for more details:
http://support.microsoft.com/kb/287515
The ONLY guaranteed mechanism are the following:

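  1. Use a cursor to loop through the rows in a specific order and concatenate the values
  2. Use a FOR XML query with ORDER BY to generate the concatenated values
  3. Use a CLR aggregate (this will not work with an ORDER BY clause)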
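
The first mechanism in that list, a cursor, could look roughly like the following. This is a minimal sketch rather than code from the Connect response: it assumes the table bla from the question, and the names @result, @msg and msg_cursor are made up for illustration.

```sql
-- Sketch only: walk the rows in priority order with a cursor and
-- concatenate one value per fetch.
-- Assumes the table bla from the question; @result, @msg and
-- msg_cursor are hypothetical names.
declare @result nvarchar(max) = N'';
declare @msg nvarchar(max);

declare msg_cursor cursor local fast_forward for
    select [msg]
    from bla
    where autofix = 0
    order by [priority] asc;

open msg_cursor;
fetch next from msg_cursor into @msg;

while @@fetch_status = 0
begin
    -- isnull keeps a NULL msg from turning the whole result into NULL
    set @result = @result + isnull(@msg, N'');
    fetch next from msg_cursor into @msg;
end

close msg_cursor;
deallocate msg_cursor;

select @result as result;
```

Because each row is appended by a separate SET statement, the result does not depend on where the optimizer places a compute scalar relative to the sort. For the two sample rows in the question this should yield AB.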

Example 3

The behavior you are seeing is actually by design. This has to do with SQL being a set-manipulation language. All expressions in the SELECT list (and this includes assignments too) are not guaranteed to be executed exactly once for each output row. In fact, SQL query optimizer tries hard to execute them as few times as possible. This will give expected results when you are computing the value of the variable based on some data in the tables, but when the value that you are assigning depends on the previous value of the same variable, the results may be quite unexpected. If the query optimizer moves the expression to a different place in the query tree, it may get evaluated less times (or just once, as in one of your examples). This is why we don't recommend using the "iteration" type assignments to compute aggregate values. We find that XML-based workarounds ... usually work well for the customers

Example 4

Even without ORDER BY, we do not guarantee that @var = @var + will produce the concatenated value for any statement that affects multiple rows. The right-hand side of the expression can be evaluated either once or multiple times during query execution and the behavior as I said is plan dependent.

Example 5

The variable assignment with SELECT statement is a proprietary syntax (T-SQL only) where the behavior is undefined or plan dependent if multiple rows are produced. If you need to do the string concatenation then use a SQLCLR aggregate or FOR XML query based concatenation or other relational methods.
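
A FOR XML based concatenation of the kind these responses recommend might be written as below. Treat it as a sketch under assumptions, not the form Microsoft prescribes: it reuses the table bla from the question, and @result is a hypothetical variable name.

```sql
-- Sketch only: concatenate msg in priority order with FOR XML PATH.
-- Assumes the table bla from the question; @result is a hypothetical name.
declare @result nvarchar(max);

select @result = (
    select [msg] as [text()]        -- emit each value as a bare text node
    from bla
    where autofix = 0
    order by [priority] asc         -- the ORDER BY applies to the subquery rows
    for xml path(''), type          -- TYPE returns xml so .value() can be called
).value('.', 'nvarchar(max)');      -- read back as text, undoing XML entity escaping

select @result as result;
```

Here the variable is assigned once from a single scalar subquery, so the statement no longer relies on per-row evaluation of @var = @var + ..., which is the part the responses above describe as plan dependent. For the sample data in the question this should produce AB.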
