奇异的性能问题：内联用户定义函数中的公用表表达式 [英] Bizarre performance issue: Common Table Expressions in inline User-Defined Function

查看：59 发布时间：2020/10/6 7:13:09 sql-server performance sql-server-2008 user-defined-functions common-table-expression

本文介绍了奇异的性能问题：内联用户定义函数中的公用表表达式的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

对于SQL专家来说，这真是令人费解-谁能想到其中第一个功能运行良好而第二个功能运行缓慢的原因？

功能A-通常在〜5毫秒内完成

 创建功能dbo.GoodFunction 
（
 @IDs UniqueIntTable ID $只读
）
返回表
作为返回值
选择p.ID，p.Node，p.Name，p.Level 
从
（
选择不同区域a。祖先AS节点
从层次结构h 
交叉应用dbo.GetAncestors（h.Node.GetAncestor（1））a 
在哪里h.ID IN（从@IDs中选择值）
）np 
内部联接层次p 
 ON p.Node = np.Node

功能B-运行非常慢-5分钟后我放弃了

 创建函数dbo.BadFunction 
（
 @IDs UniqueIntTable只读
）
返回表
作为返回
祖先s_CTE AS 
（
选择区域a。祖先AS节点
从层次结构c 
交叉应用dbo.GetAncestors（c.Node.GetAncestor（1））a 
在哪里c.ID IN（从@IDs中选择值）
）
 SELECT p.ID，p.Node，p.Name，p.Level 
 FROM Ancestors_CTE ac 
 INNER JOIN层次结构p 
 ON p.Node = ac.Node

我将在下面解释此功能确实可以，但是在开始之前，我想指出一点并不重要，因为据我所知，这两个函数是完全相同的！一个使用CTE，一个使用子查询；

如果有人认为这很重要，则此功能的目的是：只是挑选出层次结构中任意数量位置的所有可能的祖先（父母，祖父母等）。 Node 列是 hierarchyid 和 dbo.GetAncestors 是一个CLR函数，它只是沿路径移动，它不执行任何数据访问。

UniqueIntTable 的含义是-它是用户定义的表类型，只有一列， Value int NOT NULL PRIMARY KEY 。这里应该建立索引的所有内容都已建立索引-函数A的执行计划本质上只是两个索引查找和一个哈希匹配，就像函数B一样。

有些对于这个奇怪的问题，甚至是陌生的方面：

我什至无法使用以下方法获得简单查询的估计执行计划函数B。性能问题似乎与此简单函数的编译有关。

如果我从函数B中取出 body，然后将其插入到内联查询中，它可以正常运行，与函数A的性能相同。因此，似乎仅在UDF内的CTE 存在问题，或者相反，仅对于使用CTE的UDF。

在测试计算机上，一个内核上的CPU使用率一路飙升至100％我尝试运行B。似乎没有太多的I / O。

我只想耸耸肩将其作为SQL Server错误并使用版本A，但是我总是尽量记住规则＃1（ SELECT Ai n't Broken ），而且我担心函数A的良好结果在某种程度上是本地化的，

任何想法吗？

更新-我现在包括一个完整的自包含脚本以进行再现。

GetAncestors函数

  [SqlFunction（FillRowMethodName = FillAncestor，
 TableDefinition =祖先architectureid ，IsDeterministic = true，
 IsPrecise = true，DataAccess = DataAccessKind.None）] 
公共静态IEnumerable GetAncestors（SqlHierarchyId h）
 {
而（！h.IsNull）
 {
收益率h; 
 h = h.GetAncestor（1）; 
} 
}

模式创建

  BEGIN TRAN 
 
创建表层次结构
（
 ID int NOT NULL ID（ 1，1）
约束PK_Hierarchy主键，
节点architectureid非空，
 [Level]作为Node.GetLevel（），
名称varchar（50）非空
）
 
创建索引IX_Hierarchy_Node 
在层次结构上（节点）
包含（名称）
 
创建索引IX_Hierarchy_NodeBF 
在层次结构上[[级别]，节点）
 
 GO 
 
插入层次结构（节点，名称）
 SELECT CAST（'/ 1 /'ASarchitectureid），'Alice'UNION ALL 
 SELECT CAST（'/ 1/1 /'AS层级id），'Bob'UNION ALL 
 SELECT CAST（'/ 1/1/1 /'AS层级id），'Charles'UNION ALL 
 SELECT CAST（'/ 1/1/2 /'AS层级id），'戴夫'UNION ALL 
 SELECT CAST（'/ 1/1/3 /'AS层级id），'艾伦'UNION ALL 
 SELECT CAST（'/ 1/2 /'AS boundaryid），'Fred'UNION ALL 
 SELECT CAST（'/ 1/3 /'AS层级id），'Graham'联盟所有
 SELECT CAST（'/ 1/3/1 /'AS层级id），'Harold'联盟所有
 SELECT CAST（ '/ 1/3/2 /'AS层级ID），'Isabelle'联盟所有
 SELECT CAST（'/ 1/4 /'AS层级ID），'John'联盟所有
 SELECT CAST（'/ 2 /'AS层级id），'Karen'联盟所有
 SELECT CAST（'/ 2/1 /'AS层级id），'Liam'联盟全部
 SELECT CAST（'/ 2/2 /'AS architectureid），'Mary'UNION ALL 
 SELECT CAST（'/ 2/2/1 /'ASarchitectureid），'Nigel'UNION ALL 
 SELECT CAST（'/ 2/2/2 /'AS等级，）'Oliver'联盟所有
 SELECT CAST（'/ 2/3 /'AS等级ID），'彼得'联盟所有
 SELECT CAST（'/ 2/3/1 /'AS等级ID） ，'Quinn'
 
 GO 
 
创建类型UniqueIntTable如表
（
 Value int NOT NULL，
 PRIMARY KEY（Value）
）
 
 GO 
 
 COMMIT 
 
 GO

上面的代码/脚本可用于创建CLR函数/ DB模式；在原始文件中使用相同的 GoodFunction 和 BadFunction 脚本。

解决方案

哈哈，试试看：

 如果OBJECT_ID（'_ HappyFunction'）不是NULL DROP FUNCTION _HappyFunction 
 IF OBJECT_ID（'_ SadFunction'）NOT NULL DROP FUNCTION _SadFunction 
 TYPE_ID（'_UniqueIntTable'）不是NULL DROP TYPE _UniqueIntTable 
 GO 
 
 b创建类型_UniqueIntTable作为表（值int非空主键）
 GO 
 
创建函数_HappyFunction（@IDs _UniqueIntTable只读）
作为返回表
选择值FROM @IDs 
 GO 
 
创建函数_SadFunction（@IDs _UniqueIntTable只读）
以CTE AS返回表
（从@IDs中选择值）
从CTE 
 GO 
 
中选择值-这将返回一个空记录集
 DECLARE @IDs _UniqueIntTable 
 SELECT * FROM _HappyFunction（@IDs）
 GO 
 
-这将挂
的DECLARE @IDs _UniqueIntTable 
 SELECT * FROM _SadFunction（@IDs）
 GO

谁会猜到吗？

Here's a brain-twister for the SQL guys - can anyone think of a reason why the first of these functions performs fine, and the second one runs dog-slow?

Function A - Typically finishes in ~5 ms

CREATE FUNCTION dbo.GoodFunction
(
    @IDs UniqueIntTable READONLY
)
RETURNS TABLE
AS RETURN
    SELECT p.ID, p.Node, p.Name, p.Level
    FROM
    (
        SELECT DISTINCT a.Ancestor AS Node
        FROM Hierarchy h
        CROSS APPLY dbo.GetAncestors(h.Node.GetAncestor(1)) a
        WHERE h.ID IN (SELECT Value FROM @IDs)
    ) np
    INNER JOIN Hierarchy p
    ON p.Node = np.Node

Function B - Runs extremely slow - I gave up after 5 minutes

CREATE FUNCTION dbo.BadFunction
(
    @IDs UniqueIntTable READONLY
)
RETURNS TABLE
AS RETURN
    WITH Ancestors_CTE AS
    (
        SELECT DISTINCT a.Ancestor AS Node
        FROM Hierarchy c
        CROSS APPLY dbo.GetAncestors(c.Node.GetAncestor(1)) a
        WHERE c.ID IN (SELECT Value FROM @IDs)
    )
    SELECT p.ID, p.Node, p.Name, p.Level
    FROM Ancestors_CTE ac
    INNER JOIN Hierarchy p
    ON p.Node = ac.Node

I'll explain below what this function does, but before I get into that, I want to point out that I don't think it's important, because as far as I can tell, these two functions are exactly the same! The only difference is that one uses a CTE and one uses a subquery; the contents of the subquery in A and the CTE in B are identical.

In case anyone decides this matters: The purpose of this function is just to pick out all the possible ancestors (parent, grandparent, etc.) of an arbitrary number of locations in a hierarchy. The Node column is a hierarchyid, and dbo.GetAncestors is a CLR function that simply walks up the path, it does not do any data access.

UniqueIntTable is what it implies - it's a user-defined table type with one column, Value int NOT NULL PRIMARY KEY. Everything here that should be indexed is indexed - the execution plan of function A is essentially just two index seeks and a hash match, as it should be with function B.

Some even stranger aspects to this strange problem:

I'm not even able to get an estimated execution plan for a simple query using function B. It almost looks like the performance issue has something to do with the compilation of this simple-looking function.
If I take the "body" out of function B and just stick it into an inline query, it runs normally, same performance as function A. So it only seems to be a problem with a CTE inside a UDF, or conversely, only with a UDF that uses a CTE.
The CPU usage on one core on the test machine spikes all the way up to 100% when I try to run B. There doesn't seem to be much I/O.

I want to just shrug it off as a SQL Server bug and use version A, but I always try to keep Rule #1 ("SELECT Ain't Broken") in mind, and I'm concerned that the good results from function A are somehow a localized fluke, that it will "fail" the same way that B does on a different server.

Any ideas?

UPDATE - I'm now including a complete self-contained script to reproduce.

GetAncestors Function

[SqlFunction(FillRowMethodName = "FillAncestor", 
    TableDefinition = "Ancestor hierarchyid", IsDeterministic = true,
    IsPrecise = true, DataAccess = DataAccessKind.None)]
public static IEnumerable GetAncestors(SqlHierarchyId h)
{
    while (!h.IsNull)
    {
        yield return h;
        h = h.GetAncestor(1);
    }
}

Schema Creation

BEGIN TRAN

CREATE TABLE Hierarchy
(
    ID int NOT NULL IDENTITY(1, 1)
        CONSTRAINT PK_Hierarchy PRIMARY KEY CLUSTERED,
    Node hierarchyid NOT NULL,
    [Level] as Node.GetLevel(),
    Name varchar(50) NOT NULL
)

CREATE INDEX IX_Hierarchy_Node
ON Hierarchy (Node)
INCLUDE (Name)

CREATE INDEX IX_Hierarchy_NodeBF
ON Hierarchy ([Level], Node)

GO

INSERT Hierarchy (Node, Name)
    SELECT CAST('/1/' AS hierarchyid), 'Alice' UNION ALL
    SELECT CAST('/1/1/' AS hierarchyid), 'Bob' UNION ALL
    SELECT CAST('/1/1/1/' AS hierarchyid), 'Charles' UNION ALL
    SELECT CAST('/1/1/2/' AS hierarchyid), 'Dave' UNION ALL
    SELECT CAST('/1/1/3/' AS hierarchyid), 'Ellen' UNION ALL
    SELECT CAST('/1/2/' AS hierarchyid), 'Fred' UNION ALL
    SELECT CAST('/1/3/' AS hierarchyid), 'Graham' UNION ALL
    SELECT CAST('/1/3/1/' AS hierarchyid), 'Harold' UNION ALL
    SELECT CAST('/1/3/2/' AS hierarchyid), 'Isabelle' UNION ALL
    SELECT CAST('/1/4/' AS hierarchyid), 'John' UNION ALL
    SELECT CAST('/2/' AS hierarchyid), 'Karen' UNION ALL
    SELECT CAST('/2/1/' AS hierarchyid), 'Liam' UNION ALL
    SELECT CAST('/2/2/' AS hierarchyid), 'Mary' UNION ALL
    SELECT CAST('/2/2/1/' AS hierarchyid), 'Nigel' UNION ALL
    SELECT CAST('/2/2/2/' AS hierarchyid), 'Oliver' UNION ALL
    SELECT CAST('/2/3/' AS hierarchyid), 'Peter' UNION ALL
    SELECT CAST('/2/3/1/' AS hierarchyid), 'Quinn'

GO

CREATE TYPE UniqueIntTable AS TABLE 
(
    Value int NOT NULL,
    PRIMARY KEY (Value)
)

GO

COMMIT

GO

The above code/script can be used to create the CLR function/DB schema; use the same GoodFunction and BadFunction scripts in the original.

解决方案

Haha, try this:

IF OBJECT_ID('_HappyFunction' ) IS NOT NULL DROP FUNCTION _HappyFunction
IF OBJECT_ID('_SadFunction'   ) IS NOT NULL DROP FUNCTION _SadFunction
IF TYPE_ID  ('_UniqueIntTable') IS NOT NULL DROP TYPE _UniqueIntTable
GO

CREATE TYPE _UniqueIntTable AS TABLE (Value int NOT NULL PRIMARY KEY)
GO

CREATE FUNCTION _HappyFunction (@IDs _UniqueIntTable READONLY)
RETURNS TABLE AS RETURN
  SELECT Value FROM @IDs
GO

CREATE FUNCTION _SadFunction (@IDs _UniqueIntTable READONLY)
RETURNS TABLE AS RETURN 
  WITH CTE AS (SELECT Value FROM @IDs)
  SELECT Value FROM CTE
GO

-- this will return an empty record set
DECLARE @IDs _UniqueIntTable 
SELECT * FROM _HappyFunction(@IDs)
GO

-- this will hang
DECLARE @IDs _UniqueIntTable 
SELECT * FROM _SadFunction(@IDs)
GO

Who would have guessed?

这篇关于奇异的性能问题：内联用户定义函数中的公用表表达式的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

奇异的性能问题：内联用户定义函数中的公用表表达式 [英] Bizarre performance issue: Common Table Expressions in inline User-Defined Function

问题描述

相关文章

数据库最新文章

热门教程

热门工具

登录关闭

奇异的性能问题：内联用户定义函数中的公用表表达式 [英] Bizarre performance issue: Common Table Expressions in inline User-Defined Function

问题描述

相关文章

数据库最新文章

热门教程

热门工具

登录 关闭

登录关闭