User defined function performance disadvantages


Problem description

I have a database with a lot of UDFs. It contains a long-running process involving lots of data manipulation and calculations.

My thinking in using UDFs is to separate logical units of information from the underlying tables. For example, if I am trying to get information about a car, I might have several tables like Color, Model, Year, etc. that I would have to join each time to get a Car. Instead, I would have a function like fnCar() to get a denormalized view of the data.

I call these functions a lot during my long-running process, and I'm wondering whether it would be better to have a denormalized working table, view, or temp table to do my data manipulation and calculations instead. Is there some disadvantage to using UDFs in general that I should be aware of in terms of performance?

For example, I make some calculations using a UDF. I then unpivot that data and store it in a table. Whenever I need to use that data again, I call a UDF to pivot the data back out. The reason we do it this way is to keep our calculations flexible: we don't want to change the data model if we add, remove, or change the calculations.

    --Calculate some values in a function
    
    declare @location table
    (
        id int,
        lattitude float,
        longitude float
    )
    
    insert into @location select  1, 40.7, 74
    insert into @location select  2, 42, 73
    insert into @location select  3, 61, 149
    insert into @location select  4, 41, 87
    
    
    declare @myLattitude float
    declare @myLongitude float
    set @myLattitude =43
    set @myLongitude = 116
    
    declare @distance table
    (
        id int,
        distance float
    )
    
    insert into @distance
    select id, sqrt(power(lattitude-@mylattitude,2)+power(longitude-@mylongitude,2))
    from @location
    
    
    
    --Store unpivoted data in a table
    declare @unpivot table
    (
        id int,
        attribute varchar(100),
        attributeValue float
    )
    
    insert into @unpivot
    (
        id,
        attribute,
        attributeValue
    )
    select id
        ,attribute
        ,attributevalue 
    from
    (
        select 
            L.id,
            L.Lattitude, 
            L.Longitude,
            D.Distance
        from @location L 
            inner join @distance D 
            on L.id=D.id
    ) a
    unpivot 
    (
        attributeValue for attribute in
        (lattitude, longitude, distance)
    ) x
    
    --Retrieve data from the store via a pivoting function for reporting
    
    select * 
    from @unpivot
    pivot 
    (
        max(attributeValue) for Attribute in (lattitude, longitude, distance)
    
    ) x
    

Solution

I'll attempt an answer.

Simply: you are doing it wrong with UDFs.

When you use UDFs, you add these problems:

1. RBAR (see bottom) processing
   This happens when you use scalar UDFs with table access in the SELECT clause. Instead of an efficient JOIN, you force a table lookup *per row*.

2. Black box processing with multi-statement TVFs
   Each TVF has to run to completion and is considered a "black box" by the query optimiser.
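As a rough illustration of the two problems, here is a minimal T-SQL sketch. The tables `dbo.Car` and `dbo.Color` and the functions `dbo.fnColorName`, `dbo.fnCarMulti` and `dbo.fnCarInline` are hypothetical names invented for this example, not objects from the question. The inline TVF at the end is also an assumption on my part about what a set-friendly alternative could look like; the answer itself only names multi-statement TVFs as the problem.

```sql
-- Hypothetical schema for illustration only (names are not from the post).
CREATE TABLE dbo.Color (ColorId int PRIMARY KEY, ColorName varchar(50));
CREATE TABLE dbo.Car   (CarId int PRIMARY KEY, ColorId int);
GO
-- GO is the SSMS/sqlcmd batch separator; CREATE FUNCTION must start a batch.

-- Problem 1: a scalar UDF with table access.
CREATE FUNCTION dbo.fnColorName (@ColorId int)
RETURNS varchar(50)
AS
BEGIN
    RETURN (SELECT ColorName FROM dbo.Color WHERE ColorId = @ColorId);
END;
GO

-- RBAR: the lookup on dbo.Color inside the function runs for each row of dbo.Car.
SELECT c.CarId, dbo.fnColorName(c.ColorId) AS ColorName
FROM dbo.Car AS c;

-- Set-based rewrite: one JOIN that the optimiser can plan as a whole.
-- LEFT JOIN keeps cars with no matching colour, as the UDF version does.
SELECT c.CarId, col.ColorName
FROM dbo.Car AS c
LEFT JOIN dbo.Color AS col ON col.ColorId = c.ColorId;
GO

-- Problem 2: a multi-statement TVF. The whole result is written into the
-- @result table variable before the caller receives any rows.
CREATE FUNCTION dbo.fnCarMulti ()
RETURNS @result TABLE (CarId int, ColorName varchar(50))
AS
BEGIN
    INSERT INTO @result (CarId, ColorName)
    SELECT c.CarId, col.ColorName
    FROM dbo.Car AS c
    LEFT JOIN dbo.Color AS col ON col.ColorId = c.ColorId;
    RETURN;
END;
GO

-- Inline TVF: a single SELECT that is expanded into the calling query,
-- much like a view, so outer filters and joins are optimised together with it.
CREATE FUNCTION dbo.fnCarInline ()
RETURNS TABLE
AS
RETURN
(
    SELECT c.CarId, col.ColorName
    FROM dbo.Car AS c
    LEFT JOIN dbo.Color AS col ON col.ColorId = c.ColorId
);
GO

-- The filter below can be pushed into the underlying tables.
SELECT CarId, ColorName FROM dbo.fnCarInline() WHERE CarId = 1;
```

Comparing the execution plans of the UDF query and the JOIN query on realistic data volumes is the most direct way to see whether the difference matters in a given process. Newer SQL Server versions can inline some scalar UDFs, so the gap may be smaller there.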

What you normally do is load a flat staging table and then JOIN to the lookup tables, so that the processing is done as a set. If this is what you mean by "denormalised", then yes, it probably works better.
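The staging approach might look roughly like the sketch below. All table names (`#Color`, `#Model`, `#CarSource`, `#CarStaging`, `#CarWork`) and the sample rows are hypothetical, chosen only to make the example self-contained; temp tables are used so it can run in a scratch session without touching real objects.

```sql
-- Hypothetical lookup and source tables with made-up sample rows.
CREATE TABLE #Color (ColorId int PRIMARY KEY, ColorName varchar(50));
CREATE TABLE #Model (ModelId int PRIMARY KEY, ModelName varchar(50));
CREATE TABLE #CarSource
(
    CarId     int PRIMARY KEY,
    ColorId   int,
    ModelId   int,
    ModelYear int
);

INSERT INTO #Color VALUES (1, 'Red'), (2, 'Blue');
INSERT INTO #Model VALUES (10, 'Sedan'), (20, 'Coupe');
INSERT INTO #CarSource VALUES (100, 1, 10, 2008), (101, 2, 20, 2009);

-- Step 1: load the flat staging table in a single statement.
SELECT CarId, ColorId, ModelId, ModelYear
INTO #CarStaging
FROM #CarSource;

-- Step 2: resolve every lookup with JOINs in one set-based pass and
-- keep the denormalised result for the rest of the process.
SELECT s.CarId, c.ColorName, m.ModelName, s.ModelYear
INTO #CarWork
FROM #CarStaging AS s
JOIN #Color AS c ON c.ColorId = s.ColorId
JOIN #Model AS m ON m.ModelId = s.ModelId;

-- Later steps read the working table directly instead of calling a UDF per row.
SELECT CarId, ColorName, ModelName, ModelYear
FROM #CarWork;
```

The lookups are paid for once, when `#CarWork` is built, rather than on every call during the long-running process.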

Using UDFs for "logical units of information" is OO/procedural thinking. SQL is set-based. What appears OK for an object or collection of objects running in native/CLR code fails for set-based data processing via a query optimiser.

Note: RBAR = Row By Agonising Row. For more, see Simple Talk's article.
