INSERT INTO 减慢表值函数 [英] INSERT INTO slows down table-valued FUNCTION

查看:41
本文介绍了INSERT INTO 减慢表值函数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在 SQL Server 2008 端,我有一个表值函数,它接收 45k 的整数 id,合并为单个 VARBINARY(MAX),将它们拆分并作为表返回.SplitIds 最多需要 5 秒.正如我在估计执行计划中看到的那样 - 100% 是表插入".是否有可能以某种方式加速此功能?

On the SQL Server 2008 side I have the table-valued function, that receives 45k of integer ids merged into single VARBINARY(MAX), splits them and returns back as a table. SplitIds takes up to 5s. As I see in the estimated execution plan - 100% is 'Table Insert'. Is it possible somehow to speed up this function?

ALTER FUNCTION [dbo].[SplitIds](@data VARBINARY(MAX))
RETURNS @result TABLE(Id INT NOT NULL)
AS
BEGIN
    IF @data IS NULL
        RETURN
    DECLARE @ptr INT = 0, @size INT = 4
    WHILE @ptr * @size < LEN(@data)
    BEGIN
        INSERT INTO @result(Id)
        VALUES(SUBSTRING(@data, @ptr * @size + 1, @size))
        SET @ptr += 1
    END
    RETURN
END

目前在 C# 端,它以下一种方式用于 Linq-to-SQL 查询:

Currently on the C# side it is used in Linq-to-SQL query in the next way:

XDbOrder[] orders =
    database.SplitIds(ConvertToVarbinary(orderIds))
    Join(
        database.Get<XDbOrder>,
        r = r.Id,
        o => o.Id,
        (r, o) => o).
    ToArray();

更一般的问题:是否有可能在 Linq-to-SQL 中以某种方式在没有 SplitIds 的情况下实现下一件事?.Contains 不起作用 - 它使用超过 2100 个 SQL 参数创建查询并崩溃.

More general question: is it possible somehow in Linq-to-SQL to implement the next thing without SplitIds? .Contains does not work - it creates the query with more than 2100 SQL parameters and crashes.

int[] orderIds = { ... 45k random entries .....};

XDbOrder[] orders =
    database.Get<XDbOrder>().
    Where(o => orderIds.Contains(o.Id)).
    ToArray();

推荐答案

您可以尝试更固定的方法.

You could try a more set based approach.

(我保留了多语句 TVF 方法,因为生成数字表的内联方法在单独运行时效果很好,但是合并到更大查询中的执行计划可能非常糟糕 - 这确保拆分发生一次而且只有一次)

(I've kept the multi statement TVF approach because the inline approach to generating a table of numbers works well in isolation but the execution plans when incorporated into a larger query can be quite catastrophically bad - this ensures that the split happens once and only once)

我还在返回表中添加了一个主键,因此它包含一个有用的索引.

I've also added a Primary Key to the return table so it contains a useful index.

CREATE FUNCTION [dbo].[SplitIds](@data VARBINARY(MAX))
RETURNS @result TABLE(Id INT NOT NULL PRIMARY KEY WITH (IGNORE_DUP_KEY=ON))
AS
  BEGIN
      IF @data IS NULL
        RETURN

      DECLARE @size INT = 4;

      WITH E1(N)
           AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
               SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
               SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), -- 1*10^1 or 10 rows
           E2(N)
           AS (SELECT 1 FROM   E1 a, E1 b), -- 1*10^2 or 100 rows
           E4(N)
           AS (SELECT 1 FROM   E2 a, E2 b), -- 1*10^4 or 10,000 rows
           E8(N)
           AS (SELECT 1 FROM   E4 a, E4 b), -- 1*10^8 or 100,000,000 rows
           Nums(N)
           AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
               FROM   E8)
      INSERT INTO @result
                  (Id)
      SELECT TOP (DATALENGTH(@data)/@size) SUBSTRING(@data, N * @size + 1, @size)
      FROM   Nums

      RETURN
  END 

以下对我来说大约在 160 毫秒内完成

The following completes in about 160ms for me

DECLARE @data VARBINARY(MAX) = 0x

WHILE DATALENGTH(@data) < 184000
  SET @data = @data + CRYPT_GEN_RANDOM(8000)

SELECT COUNT(*)
FROM   [dbo].[SplitIds](@data) 

这篇关于INSERT INTO 减慢表值函数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆