Using SQL concatenation with ORDER BY

Question

I'm confused. How would you explain this difference in variable concatenation with ORDER BY?

declare @tbl table (id int);
insert into @tbl values (1), (2), (3);

declare @msg1 varchar(100) = '', @msg2 varchar(100) = '',
    @msg3 varchar(100) = '',    @msg4 varchar(100) = '';

select @msg1 = @msg1 + cast(id as varchar) from @tbl
order by id;

select @msg2 = @msg2 + cast(id as varchar) from @tbl
order by id+id;

select @msg3 = @msg3 + cast(id as varchar) from @tbl
order by id+id desc;

select TOP(100) @msg4 = @msg4 + cast(id as varchar) from @tbl
order by id+id;

select
    @msg1 as msg1,
    @msg2 as msg2,
    @msg3 as msg3,
    @msg4 as msg4;

Result

msg1  msg2  msg3  msg4
----  ----  ----  ----
123   3     1     123  

Recommended answer

As many have confirmed, this is not the right way to concatenate all the rows in a column into a variable, even though in some cases it does "work". If you want to see some alternatives, please check out this blog.
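As an illustration of what such alternatives can look like (these are not taken from the blog the answer refers to), here is a minimal sketch of two commonly used approaches, reusing the @tbl table variable from the question. STRING_AGG assumes SQL Server 2017 or later; the FOR XML PATH('') form also works on the older versions discussed here. The variable name @msg is only for illustration.

```sql
declare @tbl table (id int);
insert into @tbl values (1), (2), (3);

-- SQL Server 2017 and later: STRING_AGG with an explicit, documented ordering.
select string_agg(cast(id as varchar(10)), '') within group (order by id) as msg
from @tbl;

-- Earlier versions: FOR XML PATH('') concatenates the rows of the subquery
-- in the order given by its ORDER BY; .value() turns the XML back into text.
declare @msg varchar(100);
set @msg = (
    select cast(id as varchar(10))
    from @tbl
    order by id + id
    for xml path(''), type
).value('.', 'varchar(100)');

select @msg as msg;
```

In both forms the ordering is part of the aggregation itself, so the result does not depend on where the optimizer places the Compute Scalar relative to the Sort.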

According to MSDN (applies to SQL Server 2008 through 2014 and Azure SQL Database), SET rather than SELECT is recommended for assigning local variables. The remarks describe how SELECT behaves when it is used for assignment anyway. The interesting points to note:

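  • Although it should typically be used to return a single value into a variable, it can return multiple values when the expression is the name of a column.
  • When the expression does return multiple values, the variable is assigned the last value returned.
  • If no value is returned, the variable retains its original value (not directly relevant here, but worth noting).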
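The points above can be checked with a small sketch. It reuses the @tbl table variable from the question; the variable names @a and @b are hypothetical and only for illustration. It also contrasts SELECT with SET, which behaves differently when the query returns no rows.

```sql
declare @tbl table (id int);
insert into @tbl values (1), (2), (3);

declare @a int = -1, @b int = -1;

-- No rows match: SELECT leaves @a at its original value (-1).
select @a = id from @tbl where id > 100;

-- No rows match: SET with a scalar subquery assigns NULL instead.
set @b = (select id from @tbl where id > 100);

select @a as a_after_select, @b as b_after_set;

-- Several rows match: SELECT silently keeps one of the values,
-- whereas SET with the same subquery would raise an error
-- because the subquery returns more than one value.
select @a = id from @tbl;
select @a as a_after_multi_row_select;
```

SET fails loudly on multiple rows, while SELECT quietly assigns whichever value was processed last.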

The first two points here are key. Concatenation happens to work because SELECT @msg1 = @msg1 + cast(id as varchar) is essentially SELECT @msg1 += cast(id as varchar), and as the syntax section notes, += is an accepted compound assignment operator in this expression. Do not expect this behavior to remain supported for string concatenation on VARCHAR: the fact that it happens to work in some situations does not make it acceptable for production code.

The underlying reason comes down to whether the Compute Scalar that evaluates the select expression works from the original id column or from an expression on the id column. You probably won't find documentation on why the optimizer chooses a specific plan for each query, but each example highlights a different case, in which the msg value is evaluated either from the column (so multiple rows are returned and concatenated) or from an expression (so only the last value is assigned).

  1. @msg1 is '123' because the Compute Scalar (the row-by-row evaluation of the variable assignment) occurs after the Sort. This lets the scalar computation return multiple values from the id column, which the += compound operator concatenates. I doubt there is specific documentation explaining why, but it appears the optimizer chose to sort before the scalar computation because the ORDER BY was on a column and not an expression.

  2. @msg2 is '3' because the Compute Scalar is done before the Sort, which leaves @msg2 in each row as just ('' + id): never concatenated, only the value of the id. Again, there is probably no documentation on why the optimizer chose this, but it appears that because the ORDER BY was an expression, the (id+id) had to be computed as part of the scalar computation before the sort could happen. At that point, the assigned value no longer references the source column; it has been replaced by an expression. Therefore, as MSDN states, the assignment works from an expression rather than a column, so the variable receives the last value of the result set. Since the sort is ASC, you get '3' here.

  3. @msg3 is '1' for the same reason as example 2, except that the sort is DESC. Again, the assignment is evaluated from an expression rather than the original column, so it receives the last value in DESC order, and you get '1'.

  4. @msg4 is '123' again because the TOP operation forces an initial scalar evaluation of the ORDER BY so that it can determine the top 100 records. This differs from examples 2 and 3, where a single scalar computation contained both the ORDER BY and SELECT computations, which turned the assigned value into an expression that no longer referred back to the original column. In example 4, TOP separates the ORDER BY and SELECT computations: after the Sort (Top N Sort) is applied, the scalar computation for the SELECT columns runs while still referencing the original column (not an expression on it), so it returns multiple rows and the concatenation occurs.
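To see where the Compute Scalar sits relative to the Sort for a given query, you can inspect the execution plan yourself. The following is a minimal sketch, assuming SET STATISTICS PROFILE is available in your environment; the graphical actual execution plan in SQL Server Management Studio shows the same operators. The plan you get may differ by version and is not guaranteed to match the ones described above.

```sql
declare @tbl table (id int);
insert into @tbl values (1), (2), (3);

declare @msg varchar(100) = '';

-- Return the plan operators (Sort, Compute Scalar, Table Scan, ...)
-- as an extra result set for each statement that follows.
set statistics profile on;

select @msg = @msg + cast(id as varchar) from @tbl
order by id + id;

set statistics profile off;

select @msg as msg;
```

Comparing the output for ORDER BY id and ORDER BY id + id shows whether the variable assignment is computed above or below the Sort operator in each plan.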
