奇异的性能问题:内联用户定义函数中的公用表表达式 [英] Bizarre performance issue: Common Table Expressions in inline User-Defined Function
问题描述
对于SQL专家来说,这真是令人费解-谁能想到其中第一个功能运行良好而第二个功能运行缓慢的原因?
功能A-通常在〜5毫秒内完成
创建功能dbo.GoodFunction
(
@IDs UniqueIntTable ID $只读
)
返回表
作为返回值
选择p.ID,p.Node,p.Name,p.Level
从
(
选择不同区域a。祖先AS节点
从层次结构h
交叉应用dbo.GetAncestors(h.Node.GetAncestor(1))a
在哪里h.ID IN(从@IDs中选择值)
)np
内部联接层次p
ON p.Node = np.Node
功能B-运行非常慢-5分钟后我放弃了
创建函数dbo.BadFunction
(
@IDs UniqueIntTable只读
)
返回表
作为返回
祖先s_CTE AS
(
选择区域a。祖先AS节点
从层次结构c
交叉应用dbo.GetAncestors(c.Node.GetAncestor(1))a
在哪里c.ID IN(从@IDs中选择值)
)
SELECT p.ID,p.Node,p.Name,p.Level
FROM Ancestors_CTE ac
INNER JOIN层次结构p
ON p.Node = ac.Node
我将在下面解释此功能确实可以,但是在开始之前,我想指出一点并不重要,因为据我所知,这两个函数是完全相同的!一个使用CTE,一个使用子查询;
如果有人认为这很重要,则此功能的目的是:只是挑选出层次结构中任意数量位置的所有可能的祖先(父母,祖父母等)。 Node
列是 hierarchyid
和 dbo.GetAncestors
是一个CLR函数,它只是沿路径移动,它不执行任何数据访问。
UniqueIntTable
的含义是-它是用户定义的表类型,只有一列, Value int NOT NULL PRIMARY KEY
。这里应该建立索引的所有内容都已建立索引-函数A的执行计划本质上只是两个索引查找和一个哈希匹配,就像函数B一样。
有些对于这个奇怪的问题,甚至是陌生的方面:
-
我什至无法使用以下方法获得简单查询的估计执行计划函数B。性能问题似乎与此简单函数的编译有关。
-
如果我从函数B中取出 body,然后将其插入到内联查询中,它可以正常运行,与函数A的性能相同。因此,似乎仅在UDF内的CTE 存在问题,或者相反,仅对于使用CTE的UDF。
-
在测试计算机上,一个内核上的CPU使用率一路飙升至100%我尝试运行B。似乎没有太多的I / O。
我只想耸耸肩将其作为SQL Server错误并使用版本A,但是我总是尽量记住规则#1( SELECT Ai n't Broken ),而且我担心函数A的良好结果在某种程度上是本地化的,
任何想法吗?
更新-我现在包括一个完整的自包含脚本以进行再现。
GetAncestors函数
[SqlFunction(FillRowMethodName = FillAncestor,
TableDefinition =祖先architectureid ,IsDeterministic = true,
IsPrecise = true,DataAccess = DataAccessKind.None)]
公共静态IEnumerable GetAncestors(SqlHierarchyId h)
{
而(!h.IsNull)
{
收益率h;
h = h.GetAncestor(1);
}
}
模式创建
BEGIN TRAN
创建表层次结构
(
ID int NOT NULL ID( 1,1)
约束PK_Hierarchy主键,
节点architectureid非空,
[Level]作为Node.GetLevel(),
名称varchar(50)非空
)
创建索引IX_Hierarchy_Node
在层次结构上(节点)
包含(名称)
创建索引IX_Hierarchy_NodeBF
在层次结构上[[级别],节点)
GO
插入层次结构(节点,名称)
SELECT CAST('/ 1 /'ASarchitectureid),'Alice'UNION ALL
SELECT CAST('/ 1/1 /'AS层级id),'Bob'UNION ALL
SELECT CAST('/ 1/1/1 /'AS层级id),'Charles'UNION ALL
SELECT CAST('/ 1/1/2 /'AS层级id),'戴夫'UNION ALL
SELECT CAST('/ 1/1/3 /'AS层级id),'艾伦'UNION ALL
SELECT CAST('/ 1/2 /'AS boundaryid),'Fred'UNION ALL
SELECT CAST('/ 1/3 /'AS层级id),'Graham'联盟所有
SELECT CAST('/ 1/3/1 /'AS层级id),'Harold'联盟所有
SELECT CAST( '/ 1/3/2 /'AS层级ID),'Isabelle'联盟所有
SELECT CAST('/ 1/4 /'AS层级ID),'John'联盟所有
SELECT CAST('/ 2 /'AS层级id),'Karen'联盟所有
SELECT CAST('/ 2/1 /'AS层级id),'Liam'联盟全部
SELECT CAST('/ 2/2 /'AS architectureid),'Mary'UNION ALL
SELECT CAST('/ 2/2/1 /'ASarchitectureid),'Nigel'UNION ALL
SELECT CAST('/ 2/2/2 /'AS等级,)'Oliver'联盟所有
SELECT CAST('/ 2/3 /'AS等级ID),'彼得'联盟所有
SELECT CAST('/ 2/3/1 /'AS等级ID) ,'Quinn'
GO
创建类型UniqueIntTable如表
(
Value int NOT NULL,
PRIMARY KEY(Value)
)
GO
COMMIT
GO
上面的代码/脚本可用于创建CLR函数/ DB模式;在原始文件中使用相同的 GoodFunction
和 BadFunction
脚本。
哈哈,试试看:
如果OBJECT_ID('_ HappyFunction')不是NULL DROP FUNCTION _HappyFunction
IF OBJECT_ID('_ SadFunction')NOT NULL DROP FUNCTION _SadFunction
TYPE_ID('_UniqueIntTable')不是NULL DROP TYPE _UniqueIntTable
GO
b创建类型_UniqueIntTable作为表(值int非空主键)
GO
创建函数_HappyFunction(@IDs _UniqueIntTable只读)
作为返回表
选择值FROM @IDs
GO
创建函数_SadFunction(@IDs _UniqueIntTable只读)
以CTE AS返回表
(从@IDs中选择值)
从CTE
GO
中选择值-这将返回一个空记录集
DECLARE @IDs _UniqueIntTable
SELECT * FROM _HappyFunction(@IDs)
GO
-这将挂
的DECLARE @IDs _UniqueIntTable
SELECT * FROM _SadFunction(@IDs)
GO
谁会猜到吗?
Here's a brain-twister for the SQL guys - can anyone think of a reason why the first of these functions performs fine, and the second one runs dog-slow?
Function A - Typically finishes in ~5 ms
CREATE FUNCTION dbo.GoodFunction
(
@IDs UniqueIntTable READONLY
)
RETURNS TABLE
AS RETURN
SELECT p.ID, p.Node, p.Name, p.Level
FROM
(
SELECT DISTINCT a.Ancestor AS Node
FROM Hierarchy h
CROSS APPLY dbo.GetAncestors(h.Node.GetAncestor(1)) a
WHERE h.ID IN (SELECT Value FROM @IDs)
) np
INNER JOIN Hierarchy p
ON p.Node = np.Node
Function B - Runs extremely slow - I gave up after 5 minutes
CREATE FUNCTION dbo.BadFunction
(
@IDs UniqueIntTable READONLY
)
RETURNS TABLE
AS RETURN
WITH Ancestors_CTE AS
(
SELECT DISTINCT a.Ancestor AS Node
FROM Hierarchy c
CROSS APPLY dbo.GetAncestors(c.Node.GetAncestor(1)) a
WHERE c.ID IN (SELECT Value FROM @IDs)
)
SELECT p.ID, p.Node, p.Name, p.Level
FROM Ancestors_CTE ac
INNER JOIN Hierarchy p
ON p.Node = ac.Node
I'll explain below what this function does, but before I get into that, I want to point out that I don't think it's important, because as far as I can tell, these two functions are exactly the same! The only difference is that one uses a CTE and one uses a subquery; the contents of the subquery in A and the CTE in B are identical.
In case anyone decides this matters: The purpose of this function is just to pick out all the possible ancestors (parent, grandparent, etc.) of an arbitrary number of locations in a hierarchy. The Node
column is a hierarchyid
, and dbo.GetAncestors
is a CLR function that simply walks up the path, it does not do any data access.
UniqueIntTable
is what it implies - it's a user-defined table type with one column, Value int NOT NULL PRIMARY KEY
. Everything here that should be indexed is indexed - the execution plan of function A is essentially just two index seeks and a hash match, as it should be with function B.
Some even stranger aspects to this strange problem:
I'm not even able to get an estimated execution plan for a simple query using function B. It almost looks like the performance issue has something to do with the compilation of this simple-looking function.
If I take the "body" out of function B and just stick it into an inline query, it runs normally, same performance as function A. So it only seems to be a problem with a CTE inside a UDF, or conversely, only with a UDF that uses a CTE.
The CPU usage on one core on the test machine spikes all the way up to 100% when I try to run B. There doesn't seem to be much I/O.
I want to just shrug it off as a SQL Server bug and use version A, but I always try to keep Rule #1 ("SELECT Ain't Broken") in mind, and I'm concerned that the good results from function A are somehow a localized fluke, that it will "fail" the same way that B does on a different server.
Any ideas?
UPDATE - I'm now including a complete self-contained script to reproduce.
GetAncestors Function
[SqlFunction(FillRowMethodName = "FillAncestor",
TableDefinition = "Ancestor hierarchyid", IsDeterministic = true,
IsPrecise = true, DataAccess = DataAccessKind.None)]
public static IEnumerable GetAncestors(SqlHierarchyId h)
{
while (!h.IsNull)
{
yield return h;
h = h.GetAncestor(1);
}
}
Schema Creation
BEGIN TRAN
CREATE TABLE Hierarchy
(
ID int NOT NULL IDENTITY(1, 1)
CONSTRAINT PK_Hierarchy PRIMARY KEY CLUSTERED,
Node hierarchyid NOT NULL,
[Level] as Node.GetLevel(),
Name varchar(50) NOT NULL
)
CREATE INDEX IX_Hierarchy_Node
ON Hierarchy (Node)
INCLUDE (Name)
CREATE INDEX IX_Hierarchy_NodeBF
ON Hierarchy ([Level], Node)
GO
INSERT Hierarchy (Node, Name)
SELECT CAST('/1/' AS hierarchyid), 'Alice' UNION ALL
SELECT CAST('/1/1/' AS hierarchyid), 'Bob' UNION ALL
SELECT CAST('/1/1/1/' AS hierarchyid), 'Charles' UNION ALL
SELECT CAST('/1/1/2/' AS hierarchyid), 'Dave' UNION ALL
SELECT CAST('/1/1/3/' AS hierarchyid), 'Ellen' UNION ALL
SELECT CAST('/1/2/' AS hierarchyid), 'Fred' UNION ALL
SELECT CAST('/1/3/' AS hierarchyid), 'Graham' UNION ALL
SELECT CAST('/1/3/1/' AS hierarchyid), 'Harold' UNION ALL
SELECT CAST('/1/3/2/' AS hierarchyid), 'Isabelle' UNION ALL
SELECT CAST('/1/4/' AS hierarchyid), 'John' UNION ALL
SELECT CAST('/2/' AS hierarchyid), 'Karen' UNION ALL
SELECT CAST('/2/1/' AS hierarchyid), 'Liam' UNION ALL
SELECT CAST('/2/2/' AS hierarchyid), 'Mary' UNION ALL
SELECT CAST('/2/2/1/' AS hierarchyid), 'Nigel' UNION ALL
SELECT CAST('/2/2/2/' AS hierarchyid), 'Oliver' UNION ALL
SELECT CAST('/2/3/' AS hierarchyid), 'Peter' UNION ALL
SELECT CAST('/2/3/1/' AS hierarchyid), 'Quinn'
GO
CREATE TYPE UniqueIntTable AS TABLE
(
Value int NOT NULL,
PRIMARY KEY (Value)
)
GO
COMMIT
GO
The above code/script can be used to create the CLR function/DB schema; use the same GoodFunction
and BadFunction
scripts in the original.
Haha, try this:
IF OBJECT_ID('_HappyFunction' ) IS NOT NULL DROP FUNCTION _HappyFunction
IF OBJECT_ID('_SadFunction' ) IS NOT NULL DROP FUNCTION _SadFunction
IF TYPE_ID ('_UniqueIntTable') IS NOT NULL DROP TYPE _UniqueIntTable
GO
CREATE TYPE _UniqueIntTable AS TABLE (Value int NOT NULL PRIMARY KEY)
GO
CREATE FUNCTION _HappyFunction (@IDs _UniqueIntTable READONLY)
RETURNS TABLE AS RETURN
SELECT Value FROM @IDs
GO
CREATE FUNCTION _SadFunction (@IDs _UniqueIntTable READONLY)
RETURNS TABLE AS RETURN
WITH CTE AS (SELECT Value FROM @IDs)
SELECT Value FROM CTE
GO
-- this will return an empty record set
DECLARE @IDs _UniqueIntTable
SELECT * FROM _HappyFunction(@IDs)
GO
-- this will hang
DECLARE @IDs _UniqueIntTable
SELECT * FROM _SadFunction(@IDs)
GO
Who would have guessed?
这篇关于奇异的性能问题:内联用户定义函数中的公用表表达式的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!