为什么 UDF 比子查询慢这么多? [英] Why is a UDF so much slower than a subquery?

查看:41
本文介绍了为什么 UDF 比子查询慢这么多?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个案例,我需要从同一个表中翻译(查找)多个值.我编写它的第一种方法是使用子查询:

I have a case where I need to translate (lookup) several values from the same table. The first way I wrote it, was using subqueries:

SELECT
    (SELECT id FROM user WHERE user_pk = created_by) AS creator,
    (SELECT id FROM user WHERE user_pk = updated_by) AS updater,
    (SELECT id FROM user WHERE user_pk = owned_by) AS owner,
    [name]
FROM asset

因为我经常使用这个子查询(也就是说,我有大约 50 个带有这些字段的表),我可能需要向子查询添加更多代码(例如,AND active = 1")我想我会将这些放入用户定义的函数 UDF 中并使用它.但是使用该 UDF 的性能非常糟糕.

As I'm using this subquery a lot (that is, I have about 50 tables with these fields), and I might need to add some more code to the subquery (for example, "AND active = 1" ) I thought I'd put these into a user-defined function UDF and use that. But the performance using that UDF was abysmal.

CREATE FUNCTION dbo.get_user ( @user_pk INT )
RETURNS INT
AS BEGIN 
    RETURN ( SELECT id
             FROM   ice.dbo.[user]
             WHERE  user_pk = @user_pk )
END

SELECT dbo.get_user(created_by) as creator, [name]
FROM asset

#1 的性能不到 1 秒.#2 的性能大约是 30 秒...

The performance of #1 is less than 1 second. Performance of #2 is about 30 seconds...

为什么,或者更重要的是,有什么方法可以在 SQL Server 2008 中编写代码,这样我就不必使用这么多子查询?

Why, or more importantly, is there any way I can code in SQL server 2008, so that I don't have to use so many subqueries?

仅对何时有用的更多解释.当我想要为用户获取文本时,这个简单的查询(即获取用户 ID)变得更加复杂,因为我必须加入配置文件才能获取语言,并与公司一起查看是否应该获取语言"从那里开始编辑,并使用翻译表获取翻译文本.对于大多数此类查询,性能是可读性和可维护性的次要问题.

Just a litte more explanation of when this is useful. This simple query (that is, get userid) gets a lot more complex when I want to have a text for a user, since I have to join with profile to get the language, with a company to see if the language should be fetch'ed from there instead, and with the translation table to get the translated text. And for most of these queries, performance is a secondary issue to readability and maintainability.

推荐答案

UDF 是查询优化器的黑匣子,因此它对每一行执行.您正在执行逐行游标.对于资产中的每一行,在另一个表中查找 id 三次.当您使用标量或多语句 UDF 时会发生这种情况(内联 UDF 只是扩展到外部查询中的宏)

The UDF is a black box to the query optimiser so it's executed for every row. You are doing a row-by-row cursor. For each row in an asset, look up an id three times in another table. This happens when you use scalar or multi-statement UDFs (In-line UDFs are simply macros that expand into the outer query)

关于这个问题的许多文章之一是标量函数、内联和性能:无聊帖子的有趣标题".

One of many articles on the problem is "Scalar functions, inlining, and performance: An entertaining title for a boring post".

可以优化子查询以关联并避免逐行操作.

The sub-queries can be optimised to correlate and avoid the row-by-row operations.

你真正想要的是:

SELECT
   uc.id AS creator,
   uu.id AS updater,
   uo.id AS owner,
   a.[name]
FROM
    asset a
    JOIN
    user uc ON uc.user_pk = a.created_by
    JOIN
    user uu ON uu.user_pk = a.updated_by
    JOIN
    user uo ON uo.user_pk = a.owned_by

2019 年 2 月更新

Update Feb 2019

SQL Server 2019 开始修复这个问题.

SQL Server 2019 starts to fix this problem.

这篇关于为什么 UDF 比子查询慢这么多?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆