奇异的性能问题:内联用户定义函数中的公用表表达式 [英] Bizarre performance issue: Common Table Expressions in inline User-Defined Function

本文介绍了奇异的性能问题:内联用户定义函数中的公用表表达式的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

对于SQL专家来说,这真是令人费解-谁能想到其中第一个功能运行良好而第二个功能运行缓慢的原因?



功能A-通常在〜5毫秒内完成

 创建功能dbo.GoodFunction 

@IDs UniqueIntTable ID $只读

返回表
作为返回值
选择p.ID,p.Node,p.Name,p.Level


选择不同区域a。祖先AS节点
从层次结构h
交叉应用dbo.GetAncestors(h.Node.GetAncestor(1))a
在哪里h.ID IN(从@IDs中选择值)
)np
内部联接层次p
ON p.Node = np.Node

功能B-运行非常慢-5分钟后我放弃了

 创建函数dbo.BadFunction 

@IDs UniqueIntTable只读

返回表
作为返回
祖先s_CTE AS

选择区域a。祖先AS节点
从层次结构c
交叉应用dbo.GetAncestors(c.Node.GetAncestor(1))a
在哪里c.ID IN(从@IDs中选择值)

SELECT p.ID,p.Node,p.Name,p.Level
FROM Ancestors_CTE ac
INNER JOIN层次结构p
ON p.Node = ac.Node

我将在下面解释此功能确实可以,但是在开始之前,我想指出一点并不重要,因为据我所知,这两个函数是完全相同的!一个使用CTE,一个使用子查询;



如果有人认为这很重要,则此功能的目的是:只是挑选出层次结构中任意数量位置的所有可能的祖先(父母,祖父母等)。 Node 列是 hierarchyid dbo.GetAncestors 是一个CLR函数,它只是沿路径移动,它不执行任何数据访问。



UniqueIntTable 的含义是-它是用户定义的表类型,只有一列, Value int NOT NULL PRIMARY KEY 。这里应该建立索引的所有内容都已建立索引-函数A的执行计划本质上只是两个索引查找和一个哈希匹配,就像函数B一样。



有些对于这个奇怪的问题,甚至是陌生的方面:




  • 我什至无法使用以下方法获得简单查询的估计执行计划函数B。性能问题似乎与此简单函数的编译有关。


  • 如果我从函数B中取出 body,然后将其插入到内联查询中,它可以正常运行,与函数A的性能相同。因此,似乎仅在UDF内的CTE 存在问题,或者相反,仅对于使用CTE的UDF。


  • 在测试计算机上,一个内核上的CPU使用率一路飙升至100%我尝试运行B。似乎没有太多的I / O。




我只想耸耸肩将其作为SQL Server错误并使用版本A,但是我总是尽量记住规则#1( SELECT Ai n't Broken ),而且我担心函数A的良好结果在某种程度上是本地化的,

任何想法吗?






更新-我现在包括一个完整的自包含脚本以进行再现。



GetAncestors函数

  [SqlFunction(FillRowMethodName = FillAncestor,
TableDefinition =祖先architectureid ,IsDeterministic = true,
IsPrecise = true,DataAccess = DataAccessKind.None)]
公共静态IEnumerable GetAncestors(SqlHierarchyId h)
{
而(!h.IsNull)
{
收益率h;
h = h.GetAncestor(1);
}
}

模式创建

  BEGIN TRAN 

创建表层次结构

ID int NOT NULL ID( 1,1)
约束PK_Hierarchy主键,
节点architectureid非空,
[Level]作为Node.GetLevel(),
名称varchar(50)非空


创建索引IX_Hierarchy_Node
在层次结构上(节点)
包含(名称)

创建索引IX_Hierarchy_NodeBF
在层次结构上[[级别],节点)

GO

插入层次结构(节点,名称)
SELECT CAST('/ 1 /'ASarchitectureid),'Alice'UNION ALL
SELECT CAST('/ 1/1 /'AS层级id),'Bob'UNION ALL
SELECT CAST('/ 1/1/1 /'AS层级id),'Charles'UNION ALL
SELECT CAST('/ 1/1/2 /'AS层级id),'戴夫'UNION ALL
SELECT CAST('/ 1/1/3 /'AS层级id),'艾伦'UNION ALL
SELECT CAST('/ 1/2 /'AS boundaryid),'Fred'UNION ALL
SELECT CAST('/ 1/3 /'AS层级id),'Graham'联盟所有
SELECT CAST('/ 1/3/1 /'AS层级id),'Harold'联盟所有
SELECT CAST( '/ 1/3/2 /'AS层级ID),'Isabelle'联盟所有
SELECT CAST('/ 1/4 /'AS层级ID),'John'联盟所有
SELECT CAST('/ 2 /'AS层级id),'Karen'联盟所有
SELECT CAST('/ 2/1 /'AS层级id),'Liam'联盟全部
SELECT CAST('/ 2/2 /'AS architectureid),'Mary'UNION ALL
SELECT CAST('/ 2/2/1 /'ASarchitectureid),'Nigel'UNION ALL
SELECT CAST('/ 2/2/2 /'AS等级,)'Oliver'联盟所有
SELECT CAST('/ 2/3 /'AS等级ID),'彼得'联盟所有
SELECT CAST('/ 2/3/1 /'AS等级ID) ,'Quinn'

GO

创建类型UniqueIntTable如表

Value int NOT NULL,
PRIMARY KEY(Value)


GO

COMMIT

GO

上面的代码/脚本可用于创建CLR函数/ DB模式;在原始文件中使用相同的 GoodFunction BadFunction 脚本。

解决方案

哈哈,试试看:

 如果OBJECT_ID('_ HappyFunction')不是NULL DROP FUNCTION _HappyFunction 
IF OBJECT_ID('_ SadFunction')NOT NULL DROP FUNCTION _SadFunction
TYPE_ID('_UniqueIntTable')不是NULL DROP TYPE _UniqueIntTable
GO

b创建类型_UniqueIntTable作为表(值int非空主键)
GO

创建函数_HappyFunction(@IDs _UniqueIntTable只读)
作为返回表
选择值FROM @IDs
GO

创建函数_SadFunction(@IDs _UniqueIntTable只读)
以CTE AS返回表
(从@IDs中选择值)
从CTE
GO

中选择值-这将返回一个空记录集
DECLARE @IDs _UniqueIntTable
SELECT * FROM _HappyFunction(@IDs)
GO

-这将挂
的DECLARE @IDs _UniqueIntTable
SELECT * FROM _SadFunction(@IDs)
GO

谁会猜到吗?


Here's a brain-twister for the SQL guys - can anyone think of a reason why the first of these functions performs fine, and the second one runs dog-slow?

Function A - Typically finishes in ~5 ms

CREATE FUNCTION dbo.GoodFunction
(
    @IDs UniqueIntTable READONLY
)
RETURNS TABLE
AS RETURN
    SELECT p.ID, p.Node, p.Name, p.Level
    FROM
    (
        SELECT DISTINCT a.Ancestor AS Node
        FROM Hierarchy h
        CROSS APPLY dbo.GetAncestors(h.Node.GetAncestor(1)) a
        WHERE h.ID IN (SELECT Value FROM @IDs)
    ) np
    INNER JOIN Hierarchy p
    ON p.Node = np.Node

Function B - Runs extremely slow - I gave up after 5 minutes

CREATE FUNCTION dbo.BadFunction
(
    @IDs UniqueIntTable READONLY
)
RETURNS TABLE
AS RETURN
    WITH Ancestors_CTE AS
    (
        SELECT DISTINCT a.Ancestor AS Node
        FROM Hierarchy c
        CROSS APPLY dbo.GetAncestors(c.Node.GetAncestor(1)) a
        WHERE c.ID IN (SELECT Value FROM @IDs)
    )
    SELECT p.ID, p.Node, p.Name, p.Level
    FROM Ancestors_CTE ac
    INNER JOIN Hierarchy p
    ON p.Node = ac.Node

I'll explain below what this function does, but before I get into that, I want to point out that I don't think it's important, because as far as I can tell, these two functions are exactly the same! The only difference is that one uses a CTE and one uses a subquery; the contents of the subquery in A and the CTE in B are identical.

In case anyone decides this matters: The purpose of this function is just to pick out all the possible ancestors (parent, grandparent, etc.) of an arbitrary number of locations in a hierarchy. The Node column is a hierarchyid, and dbo.GetAncestors is a CLR function that simply walks up the path, it does not do any data access.

UniqueIntTable is what it implies - it's a user-defined table type with one column, Value int NOT NULL PRIMARY KEY. Everything here that should be indexed is indexed - the execution plan of function A is essentially just two index seeks and a hash match, as it should be with function B.

Some even stranger aspects to this strange problem:

  • I'm not even able to get an estimated execution plan for a simple query using function B. It almost looks like the performance issue has something to do with the compilation of this simple-looking function.

  • If I take the "body" out of function B and just stick it into an inline query, it runs normally, same performance as function A. So it only seems to be a problem with a CTE inside a UDF, or conversely, only with a UDF that uses a CTE.

  • The CPU usage on one core on the test machine spikes all the way up to 100% when I try to run B. There doesn't seem to be much I/O.

I want to just shrug it off as a SQL Server bug and use version A, but I always try to keep Rule #1 ("SELECT Ain't Broken") in mind, and I'm concerned that the good results from function A are somehow a localized fluke, that it will "fail" the same way that B does on a different server.

Any ideas?


UPDATE - I'm now including a complete self-contained script to reproduce.

GetAncestors Function

[SqlFunction(FillRowMethodName = "FillAncestor", 
    TableDefinition = "Ancestor hierarchyid", IsDeterministic = true,
    IsPrecise = true, DataAccess = DataAccessKind.None)]
public static IEnumerable GetAncestors(SqlHierarchyId h)
{
    while (!h.IsNull)
    {
        yield return h;
        h = h.GetAncestor(1);
    }
}

Schema Creation

BEGIN TRAN

CREATE TABLE Hierarchy
(
    ID int NOT NULL IDENTITY(1, 1)
        CONSTRAINT PK_Hierarchy PRIMARY KEY CLUSTERED,
    Node hierarchyid NOT NULL,
    [Level] as Node.GetLevel(),
    Name varchar(50) NOT NULL
)

CREATE INDEX IX_Hierarchy_Node
ON Hierarchy (Node)
INCLUDE (Name)

CREATE INDEX IX_Hierarchy_NodeBF
ON Hierarchy ([Level], Node)

GO

INSERT Hierarchy (Node, Name)
    SELECT CAST('/1/' AS hierarchyid), 'Alice' UNION ALL
    SELECT CAST('/1/1/' AS hierarchyid), 'Bob' UNION ALL
    SELECT CAST('/1/1/1/' AS hierarchyid), 'Charles' UNION ALL
    SELECT CAST('/1/1/2/' AS hierarchyid), 'Dave' UNION ALL
    SELECT CAST('/1/1/3/' AS hierarchyid), 'Ellen' UNION ALL
    SELECT CAST('/1/2/' AS hierarchyid), 'Fred' UNION ALL
    SELECT CAST('/1/3/' AS hierarchyid), 'Graham' UNION ALL
    SELECT CAST('/1/3/1/' AS hierarchyid), 'Harold' UNION ALL
    SELECT CAST('/1/3/2/' AS hierarchyid), 'Isabelle' UNION ALL
    SELECT CAST('/1/4/' AS hierarchyid), 'John' UNION ALL
    SELECT CAST('/2/' AS hierarchyid), 'Karen' UNION ALL
    SELECT CAST('/2/1/' AS hierarchyid), 'Liam' UNION ALL
    SELECT CAST('/2/2/' AS hierarchyid), 'Mary' UNION ALL
    SELECT CAST('/2/2/1/' AS hierarchyid), 'Nigel' UNION ALL
    SELECT CAST('/2/2/2/' AS hierarchyid), 'Oliver' UNION ALL
    SELECT CAST('/2/3/' AS hierarchyid), 'Peter' UNION ALL
    SELECT CAST('/2/3/1/' AS hierarchyid), 'Quinn'

GO

CREATE TYPE UniqueIntTable AS TABLE 
(
    Value int NOT NULL,
    PRIMARY KEY (Value)
)

GO

COMMIT

GO

The above code/script can be used to create the CLR function/DB schema; use the same GoodFunction and BadFunction scripts in the original.

解决方案

Haha, try this:

IF OBJECT_ID('_HappyFunction' ) IS NOT NULL DROP FUNCTION _HappyFunction
IF OBJECT_ID('_SadFunction'   ) IS NOT NULL DROP FUNCTION _SadFunction
IF TYPE_ID  ('_UniqueIntTable') IS NOT NULL DROP TYPE _UniqueIntTable
GO

CREATE TYPE _UniqueIntTable AS TABLE (Value int NOT NULL PRIMARY KEY)
GO

CREATE FUNCTION _HappyFunction (@IDs _UniqueIntTable READONLY)
RETURNS TABLE AS RETURN
  SELECT Value FROM @IDs
GO

CREATE FUNCTION _SadFunction (@IDs _UniqueIntTable READONLY)
RETURNS TABLE AS RETURN 
  WITH CTE AS (SELECT Value FROM @IDs)
  SELECT Value FROM CTE
GO

-- this will return an empty record set
DECLARE @IDs _UniqueIntTable 
SELECT * FROM _HappyFunction(@IDs)
GO

-- this will hang
DECLARE @IDs _UniqueIntTable 
SELECT * FROM _SadFunction(@IDs)
GO

Who would have guessed?

这篇关于奇异的性能问题:内联用户定义函数中的公用表表达式的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆