SQL Server INSERT 插入大表很慢 [英] SQL Server INSERT into huge table is slow

查看:47
本文介绍了SQL Server INSERT 插入大表很慢的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在 SQl Server 2008 数据库中有一个大表,它有大约 5.7 亿条记录.每天我们都运行一个批处理作业,它接收一个包含 200,000 条左右事务记录的文件,对这些数据进行分组和求和,然后将其插入到大表中.

I have a large table in a SQl Server 2008 database, it has about 570 million records. Every day we run a batch job that takes a file of 200,000 or so transaction records, does a group by and sum against this data and inserts it into the large table.

最近我尝试将大表的聚集索引更改为标识 int 列,这使插入时间从 3 小时减少到 1 小时,但我仍然感到困惑,为什么这个简单的查询要花这么长时间才能运行(不管桌子的大小)

Recently I have experimented with changing the clustered index of the large table to an identity int column, which has brought the insert down from 3 hours to one hour, but I am still puzzled why this simple query should take so long to run (regardless of the size of the table)

这是一个有 5.7 亿行的表格

This is the table with 570 million rows

CREATE TABLE [dbo].[POINTS_EARNED](
[POINTS_EARNED_ID]int identity not null,
[CARD_ID] [int] NOT NULL,
[CYCLE_ID] [int] NOT NULL,
[POINTS_CODE] [int] NOT NULL,
[NO_POINTS] [int] NULL,
[ACCOUNT_ID] [int] NOT NULL,
[CREATED_DATE] [datetime] NULL,
[CREATED_BY] [varchar](20) NULL,
[LAST_MODIFIED_DATE] [datetime] NULL,
[LAST_MODIFIED_BY] [varchar](20) NULL,
[DELETED] [bit] NULL,
CONSTRAINT [PK_POINTS_EARNED] PRIMARY KEY CLUSTERED 
(
[POINTS_EARNED_ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, 
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

) ON [PRIMARY]

它还具有一些约束(默认值和外键)和索引,我想知道这些是否是导致问题的原因.

It also has some constraints (defaults and foreign keys) and indexes, and I am wondering if these are what is causing the problem.

实际运行一个小时的 SQL 是:

The actual SQL that takes an hour to run is:

insert into points_earned (
    card_id,
    cycle_id,
    points_code,
    no_points,
    account_id
    ) 

select pe.card_id, pe.cycle_id, pe.points_code, sum(pe.no_points),pe.account_id
from #points_earned pe 
group by pe.card_id, pe.cycle_id, pe.points_code,pe.account_id

临时表 #points_earned 大约有 200,000 行,并且具有以下结构(没有索引)

and the temp table #points_earned has about 200,000 rows, and has the following structure (with no indexes)

create table #points_earned (
        card_id           int,
        cycle_id    int,
        points_code int,
        card_type   varchar(5),
        no_points   int,
        account_id int
        )

所以,我想对我是否应该提出一些意见

So, I would like some opinions on whether I should

  • 在临时表上添加索引
  • 在添加数据之前删除大表上的非聚集索引,然后重新创建它们
  • 还有其他选择吗?

更新 - 根据要求提供更多信息- select 语句在 2 秒内没有插入的情况下运行,所以这似乎不是问题,所以可能不需要担心索引临时表

Update - as requested a bit more info - The select statement runs without the insert in 2 seconds, so this doesn't appear to be the problem, so probably don't need o worry about indexing the temp table

索引、(更新)触发器和约束是:

Indexes, (update) trigger, and constraints are:

    CREATE NONCLUSTERED INDEX [IDX_CYCLE_ID] ON [dbo].[POINTS_EARNED] 
(
    [CYCLE_ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_ACCOUNT_ID] ON [dbo].[POINTS_EARNED] 
(
    [ACCOUNT_ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_ACCOUNT_ID_POINTS_CODE] ON [dbo].[POINTS_EARNED] 
(
    [ACCOUNT_ID] ASC,
    [POINTS_CODE] ASC
)
INCLUDE ( [CARD_ID],
[CYCLE_ID],
[NO_POINTS]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [RELATION_151_FK] ON [dbo].[POINTS_EARNED] 
(
    [CARD_ID] ASC,
    [CYCLE_ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [RELATION_152_FK] ON [dbo].[POINTS_EARNED] 
(
    [POINTS_CODE] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
/****** Object:  Trigger [update_points_earned]    Script Date: 09/13/2013 13:20:54 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TRIGGER [dbo].[update_points_earned] ON [dbo].[POINTS_EARNED]
FOR UPDATE
AS
 BEGIN
  UPDATE points_earned 
  SET Last_Modified_By = USER,
   Last_Modified_Date = GETDATE()  
  FROM
   points_earned t,
   inserted i
  WHERE
   t.card_id = i.card_id AND
   t.cycle_id = i.cycle_id AND
   t.points_code = i.points_code AND
   t.account_id = i.account_id
 END
GO
/****** Object:  Default [DF_POINTS_EARNED_ACCOUNT_ID]    Script Date: 09/13/2013 13:20:54 ******/
ALTER TABLE [dbo].[POINTS_EARNED] ADD  CONSTRAINT [DF_POINTS_EARNED_ACCOUNT_ID]  DEFAULT ((0)) FOR [ACCOUNT_ID]
GO
/****** Object:  Default [DF_POINTS_EARNED_CREATED_DATE]    Script Date: 09/13/2013 13:20:54 ******/
ALTER TABLE [dbo].[POINTS_EARNED] ADD  CONSTRAINT [DF_POINTS_EARNED_CREATED_DATE]  DEFAULT (getdate()) FOR [CREATED_DATE]
GO
/****** Object:  Default [DF_POINTS_EARNED_CREATED_BY]    Script Date: 09/13/2013 13:20:54 ******/
ALTER TABLE [dbo].[POINTS_EARNED] ADD  CONSTRAINT [DF_POINTS_EARNED_CREATED_BY]  DEFAULT (user_name()) FOR [CREATED_BY]
GO
/****** Object:  Default [DF_POINTS_EARNED_DELETED]    Script Date: 09/13/2013 13:20:54 ******/
ALTER TABLE [dbo].[POINTS_EARNED] ADD  CONSTRAINT [DF_POINTS_EARNED_DELETED]  DEFAULT ((0)) FOR [DELETED]
GO
/****** Object:  ForeignKey [FK_POINTS_E_REFERENCE_CYCLE_CA]    Script Date: 09/13/2013 13:20:54 ******/
ALTER TABLE [dbo].[POINTS_EARNED]  WITH CHECK ADD  CONSTRAINT [FK_POINTS_E_REFERENCE_CYCLE_CA] FOREIGN KEY([CARD_ID], [CYCLE_ID])
REFERENCES [dbo].[CYCLE_CARD] ([CARD_ID], [CYCLE_ID])
GO
ALTER TABLE [dbo].[POINTS_EARNED] CHECK CONSTRAINT [FK_POINTS_E_REFERENCE_CYCLE_CA]
GO
/****** Object:  ForeignKey [FK_POINTS_E_REFERENCE_POINTS_C]    Script Date: 09/13/2013 13:20:54 ******/
ALTER TABLE [dbo].[POINTS_EARNED]  WITH NOCHECK ADD  CONSTRAINT [FK_POINTS_E_REFERENCE_POINTS_C] FOREIGN KEY([POINTS_CODE])
REFERENCES [dbo].[POINTS_CODE] ([POINTS_CODE])
GO
ALTER TABLE [dbo].[POINTS_EARNED] CHECK CONSTRAINT [FK_POINTS_E_REFERENCE_POINTS_C]
GO
/****** Object:  ForeignKey [FK_POINTS_EARNED_REF_ACCOUNT]    Script Date: 09/13/2013 13:20:54 ******/
ALTER TABLE [dbo].[POINTS_EARNED]  WITH NOCHECK ADD  CONSTRAINT [FK_POINTS_EARNED_REF_ACCOUNT] FOREIGN KEY([ACCOUNT_ID])
REFERENCES [dbo].[ACCOUNT] ([ACCOUNT_ID])
GO
ALTER TABLE [dbo].[POINTS_EARNED] CHECK CONSTRAINT [FK_POINTS_EARNED_REF_ACCOUNT]

编辑2,插入语句的查询计划

  |--Sequence
   |--Index Insert(OBJECT:([Progressive_Points].[dbo].[POINTS_EARNED].[IDX_CYCLE_ID]), SET:([POINTS_EARNED_ID1040] = [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID],[CYCLE_ID1041] = [Progressive_Points].[dbo].[POINTS_EARNED].[CYCLE_ID]) WITH ORDERED PREFETCH)
   |    |--Sort(ORDER BY:([Progressive_Points].[dbo].[POINTS_EARNED].[CYCLE_ID] ASC, [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID] ASC))
   |         |--Table Spool
   |              |--Clustered Index Insert(OBJECT:([Progressive_Points].[dbo].[POINTS_EARNED].[PK_POINTS_EARNED]), SET:([Progressive_Points].[dbo].[POINTS_EARNED].[CARD_ID] = RaiseIfNullInsert([tempdb].[dbo].[#points_earned].[card_id] as [pe].[card_id]),[Progressive_Points].[dbo].[POINTS_EARNED].[CYCLE_ID] = RaiseIfNullInsert([tempdb].[dbo].[#points_earned].[cycle_id] as [pe].[cycle_id]),[Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_CODE] = RaiseIfNullInsert([tempdb].[dbo].[#points_earned].[points_code] as [pe].[points_code]),[Progressive_Points].[dbo].[POINTS_EARNED].[NO_POINTS] = [Expr1006],[Progressive_Points].[dbo].[POINTS_EARNED].[ACCOUNT_ID] = RaiseIfNullInsert([tempdb].[dbo].[#points_earned].[account_id] as [pe].[account_id]),[Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID] = [Expr1007],[Progressive_Points].[dbo].[POINTS_EARNED].[CREATED_DATE] = [Expr1008],[Progressive_Points].[dbo].[POINTS_EARNED].[CREATED_BY] = [Expr1009],[Progressive_Points].[dbo].[POINTS_EARNED].[DELETED] = [Expr1010],[Progressive_Points].[dbo].[POINTS_EARNED].[LAST_MODIFIED_DATE] = NULL,[Progressive_Points].[dbo].[POINTS_EARNED].[LAST_MODIFIED_BY] = NULL) WITH UNORDERED PREFETCH)
   |                   |--Compute Scalar(DEFINE:([Expr1008]=getdate(), [Expr1009]=CONVERT_IMPLICIT(varchar(20),user_name(),0), [Expr1010]=(0)))
   |                        |--Compute Scalar(DEFINE:([Expr1007]=getidentity((1243867498),(8),NULL)))
   |                             |--Top(ROWCOUNT est 0)
   |                                  |--Parallelism(Gather Streams)
   |                                       |--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [Expr1062]=(0) THEN NULL ELSE [Expr1063] END))
   |                                            |--Hash Match(Aggregate, HASH:([pe].[card_id], [pe].[cycle_id], [pe].[points_code], [pe].[account_id]), RESIDUAL:([tempdb].[dbo].[#points_earned].[card_id] as [pe].[card_id] = [tempdb].[dbo].[#points_earned].[card_id] as [pe].[card_id] AND [tempdb].[dbo].[#points_earned].[cycle_id] as [pe].[cycle_id] = [tempdb].[dbo].[#points_earned].[cycle_id] as [pe].[cycle_id] AND [tempdb].[dbo].[#points_earned].[points_code] as [pe].[points_code] = [tempdb].[dbo].[#points_earned].[points_code] as [pe].[points_code] AND [tempdb].[dbo].[#points_earned].[account_id] as [pe].[account_id] = [tempdb].[dbo].[#points_earned].[account_id] as [pe].[account_id]) DEFINE:([Expr1062]=COUNT_BIG([tempdb].[dbo].[#points_earned].[no_points] as [pe].[no_points]), [Expr1063]=SUM([tempdb].[dbo].[#points_earned].[no_points] as [pe].[no_points])))
   |                                                 |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([pe].[card_id], [pe].[cycle_id], [pe].[points_code], [pe].[account_id]))
   |                                                      |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[#points_earned] AS [pe]))
   |--Index Insert(OBJECT:([Progressive_Points].[dbo].[POINTS_EARNED].[IX_ACCOUNT_ID]), SET:([POINTS_EARNED_ID1042] = [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID],[ACCOUNT_ID1043] = [Progressive_Points].[dbo].[POINTS_EARNED].[ACCOUNT_ID]) WITH ORDERED PREFETCH)
   |    |--Sort(ORDER BY:([Progressive_Points].[dbo].[POINTS_EARNED].[ACCOUNT_ID] ASC, [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID] ASC))
   |         |--Table Spool
   |--Assert(WHERE:(CASE WHEN [Expr1050] IS NULL THEN (0) ELSE NULL END))
   |    |--Nested Loops(Left Semi Join, OUTER REFERENCES:([Progressive_Points].[dbo].[POINTS_EARNED].[ACCOUNT_ID], [Expr1068]) WITH UNORDERED PREFETCH, DEFINE:([Expr1050] = [PROBE VALUE]))
   |         |--Index Insert(OBJECT:([Progressive_Points].[dbo].[POINTS_EARNED].[IX_ACCOUNT_ID_POINTS_CODE]), SET:([POINTS_EARNED_ID1044] = [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID],[CARD_ID1045] = [Progressive_Points].[dbo].[POINTS_EARNED].[CARD_ID],[CYCLE_ID1046] = [Progressive_Points].[dbo].[POINTS_EARNED].[CYCLE_ID],[POINTS_CODE1047] = [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_CODE],[NO_POINTS1048] = [Progressive_Points].[dbo].[POINTS_EARNED].[NO_POINTS],[ACCOUNT_ID1049] = [Progressive_Points].[dbo].[POINTS_EARNED].[ACCOUNT_ID]) WITH UNORDERED PREFETCH)
   |         |    |--Table Spool
   |         |--Clustered Index Seek(OBJECT:([Progressive_Points].[dbo].[ACCOUNT].[PK_ACCOUNT]), SEEK:([Progressive_Points].[dbo].[ACCOUNT].[ACCOUNT_ID]=[Progressive_Points].[dbo].[POINTS_EARNED].[ACCOUNT_ID]) ORDERED FORWARD)
   |--Assert(WHERE:(CASE WHEN [Expr1054] IS NULL THEN (0) ELSE NULL END))
   |    |--Nested Loops(Left Semi Join, OUTER REFERENCES:([Progressive_Points].[dbo].[POINTS_EARNED].[CARD_ID], [Progressive_Points].[dbo].[POINTS_EARNED].[CYCLE_ID], [Expr1070]) WITH UNORDERED PREFETCH, DEFINE:([Expr1054] = [PROBE VALUE]))
   |         |--Index Insert(OBJECT:([Progressive_Points].[dbo].[POINTS_EARNED].[RELATION_151_FK]), SET:([POINTS_EARNED_ID1051] = [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID],[CARD_ID1052] = [Progressive_Points].[dbo].[POINTS_EARNED].[CARD_ID],[CYCLE_ID1053] = [Progressive_Points].[dbo].[POINTS_EARNED].[CYCLE_ID]) WITH ORDERED PREFETCH)
   |         |    |--Sort(ORDER BY:([Progressive_Points].[dbo].[POINTS_EARNED].[CARD_ID] ASC, [Progressive_Points].[dbo].[POINTS_EARNED].[CYCLE_ID] ASC, [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID] ASC))
   |         |         |--Table Spool
   |         |--Row Count Spool
   |              |--Index Seek(OBJECT:([Progressive_Points].[dbo].[CYCLE_CARD].[IDX_NCLST_CARD_ID_CYCLE_ID]), SEEK:([Progressive_Points].[dbo].[CYCLE_CARD].[CARD_ID]=[Progressive_Points].[dbo].[POINTS_EARNED].[CARD_ID] AND [Progressive_Points].[dbo].[CYCLE_CARD].[CYCLE_ID]=[Progressive_Points].[dbo].[POINTS_EARNED].[CYCLE_ID]) ORDERED FORWARD)
   |--Assert(WHERE:(CASE WHEN [Expr1057] IS NULL THEN (0) ELSE NULL END))
        |--Merge Join(Left Semi Join, MANY-TO-MANY MERGE:([Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_CODE])=([Progressive_Points].[dbo].[POINTS_CODE].[POINTS_CODE]), RESIDUAL:([Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_CODE]=[Progressive_Points].[dbo].[POINTS_CODE].[POINTS_CODE]))
             |--Index Insert(OBJECT:([Progressive_Points].[dbo].[POINTS_EARNED].[RELATION_152_FK]), SET:([POINTS_EARNED_ID1055] = [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID],[POINTS_CODE1056] = [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_CODE]) WITH ORDERED PREFETCH)
             |    |--Sort(ORDER BY:([Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_CODE] ASC, [Progressive_Points].[dbo].[POINTS_EARNED].[POINTS_EARNED_ID] ASC))
             |         |--Table Spool
             |--Index Scan(OBJECT:([Progressive_Points].[dbo].[POINTS_CODE].[POINTS_CODES_PK]), ORDERED FORWARD)

推荐答案

好的,我会这样做:

  1. 检查您是否需要索引 [IX_ACCOUNT_ID_POINTS_CODE] 和 [IX_ACCOUNT_ID],因为它们可能是多余的.

  1. Check to see if you need both indexes [IX_ACCOUNT_ID_POINTS_CODE] and [IX_ACCOUNT_ID] as they may be redundant.

在执行 INSERT 之前,禁用触发器并删除外键.

Before you do the INSERT, Disable the Trigger and drop the Foreign Keys.

执行 INSERT 设置通常由触发器设置的字段,并确保 FK 列的值有效.

Do the INSERT setting the fields normally set by the Trigger, and insuring that the FK Column's values are valid.

重新启用触发器,并使用 NOCHECK 重新创建外键.

Re-Enable the trigger, and re-create the Foreign Keys WITH NOCHECK.

当您插入的行数少于总行数的 0.2% 时,我会保留索引,因此就地更新它们可能比删除和重建它们更快.

I would leave the indexes on as you are inserting less than 0.2% of the total row count so it's probably faster to update them in-place rather than to drop and rebuild them.

这篇关于SQL Server INSERT 插入大表很慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆