Why do SQL Server Scalar-valued functions get slower?

Question

Why do scalar-valued functions seem to make queries run cumulatively slower the more times in succession they are called?

I have this table that was built with data purchased from a third party.

I've trimmed out some columns to make this post shorter, but this should give you an idea of how things are set up.

CREATE TABLE [dbo].[GIS_Location](
        [ID] [int] IDENTITY(1,1) NOT NULL, --PK
        [Lat] [int] NOT NULL,
        [Lon] [int] NOT NULL,
        [Postal_Code] [varchar](7) NOT NULL,
        [State] [char](2) NOT NULL,
        [City] [varchar](30) NOT NULL,
        [Country] [char](3) NOT NULL
        -- (remaining columns trimmed)
)

CREATE TABLE [dbo].[Address_Location](
    [ID] [int] IDENTITY(1,1) NOT NULL, --PK
    [Address_Type_ID] [int] NULL,
    [Location] [varchar](100) NOT NULL,
    [State] [char](2) NOT NULL,
    [City] [varchar](30) NOT NULL,
    [Postal_Code] [varchar](10) NOT NULL,
    [Postal_Extension] [varchar](10) NULL,
    [Country_Code] [varchar](10) NULL
    -- (remaining columns trimmed)
)

Then I have two functions that look up LAT and LON.

CREATE FUNCTION [dbo].[usf_GIS_GET_LAT]
(
    @City VARCHAR(30),
    @State CHAR(2)
)
RETURNS INT 
WITH EXECUTE AS CALLER
AS
BEGIN
    DECLARE @LAT INT

    SET @LAT = (SELECT TOP 1 LAT FROM GIS_Location WITH(NOLOCK) WHERE [State] = @State AND [City] = @City)

RETURN @LAT
END

GO


CREATE FUNCTION [dbo].[usf_GIS_GET_LON]
(
    @City VARCHAR(30),
    @State CHAR(2)
)
RETURNS INT 
WITH EXECUTE AS CALLER
AS
BEGIN
    DECLARE @LON INT

    SET @LON = (SELECT TOP 1 LON FROM GIS_Location WITH(NOLOCK) WHERE [State] = @State AND [City] = @City)

RETURN @LON
END

When I run the following...

SET STATISTICS TIME ON

SELECT
    dbo.usf_GIS_GET_LAT(City,[State]) AS Lat,
    dbo.usf_GIS_GET_LON(City,[State]) AS Lon
FROM
    Address_Location WITH(NOLOCK)
WHERE
    ID IN (SELECT TOP 100 ID FROM Address_Location WITH(NOLOCK) ORDER BY ID DESC)

SET STATISTICS TIME OFF

Timings by row count: 100 ~= 8 ms, 200 ~= 32 ms, 400 ~= 876 ms

--Edit: Sorry, I should have been clearer. I'm not looking to tune the query listed above. It is just a sample to show the execution time getting slower the more records it crunches through. In the real-world application the functions are used as part of a WHERE clause to build a radius around a city and state, so that all records within that region are included.

Recommended answer

In most cases, it's best to avoid scalar-valued functions that reference tables because (as others said) they are basically black boxes that need to be run once for every row, and cannot be optimized by the query plan engine. Therefore, they tend to scale linearly even if the associated tables have indexes.
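One way to see the per-row behavior for yourself is to check how many times each function body actually ran. The query below is a minimal sketch, assuming SQL Server 2016 or later (where the `sys.dm_exec_function_stats` view is available) and VIEW SERVER STATE permission. Run the test query from the question first, then run this in the same database. If a newer engine version inlines the scalar function into the calling query, the function may not show up here at all.

```sql
-- Sketch only: assumes SQL Server 2016+ and VIEW SERVER STATE permission.
-- Shows how often each scalar function has executed since its plan was cached.
SELECT
    OBJECT_NAME(fs.object_id, fs.database_id) AS function_name,
    fs.execution_count,                                  -- times the function body ran
    fs.total_elapsed_time / 1000.0 AS total_elapsed_ms   -- the view reports microseconds
FROM sys.dm_exec_function_stats AS fs
WHERE fs.database_id = DB_ID()
  AND fs.object_id IN (OBJECT_ID(N'dbo.usf_GIS_GET_LAT'),
                       OBJECT_ID(N'dbo.usf_GIS_GET_LON'));
```

If the execution count grows by roughly the number of rows the outer query returned, each row triggered its own lookup.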

You may want to consider using an inline table-valued function instead, since these are evaluated inline with the query and can be optimized. You get the encapsulation you want, with the performance of pasting the expressions right into the SELECT statement.

As a side effect of being inlined, they can't contain any procedural code (no declare @variable; set @variable = ..; return). However, they can return several rows and columns.

You could rewrite your functions something like this:

create function usf_GIS_GET_LAT(
    @City varchar (30),
    @State char (2)
)
returns table
as return (
  select top 1 lat
  from GIS_Location with (nolock) 
  where [State] = @State
    and [City] = @City
);

GO

create function usf_GIS_GET_LON (
    @City varchar (30),
    @State char (2)
)
returns table
as return (
  select top 1 LON
  from GIS_Location with (nolock)
  where [State] = @State
    and [City] = @City
);

The syntax to use them is also a little different:

select
    Lat.Lat,
    Lon.Lon
from
    Address_Location with (nolock)
    cross apply dbo.usf_GIS_GET_LAT(City,[State]) AS Lat
    cross apply dbo.usf_GIS_GET_LON(City,[State]) AS Lon
WHERE
    ID IN (SELECT TOP 100 ID FROM Address_Location WITH(NOLOCK) ORDER BY ID DESC)
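
Two details of the rewrite above may be worth a second look. First, `cross apply` drops any address row that has no matching `GIS_Location` row, whereas the original scalar functions would have returned NULL for it; `outer apply` keeps those rows. Second, since an inline function can return several columns, a single lookup can supply both values. The sketch below illustrates both points. The function name `usf_GIS_GET_LAT_LON` and the aliases `A` and `Loc` are made up for illustration and are not part of the original answer.

```sql
-- Hypothetical combined function: the name usf_GIS_GET_LAT_LON is an assumption.
-- Returns both coordinates from the same GIS_Location row in one lookup.
create function dbo.usf_GIS_GET_LAT_LON (
    @City varchar (30),
    @State char (2)
)
returns table
as return (
  select top 1 Lat, Lon
  from dbo.GIS_Location with (nolock)
  where [State] = @State
    and [City] = @City
);

GO

-- outer apply keeps addresses with no match (Lat and Lon come back as NULL),
-- which mirrors what the original scalar functions returned.
select
    Loc.Lat,
    Loc.Lon
from
    Address_Location as A with (nolock)
    outer apply dbo.usf_GIS_GET_LAT_LON(A.City, A.[State]) as Loc
where
    A.ID in (select top 100 ID from Address_Location with (nolock) order by ID desc);
```

Because `top 1` without an `order by` returns an arbitrary matching row, reading Lat and Lon in one function also guarantees that both values come from the same row when a city and state match more than one location.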
