在 T-SQL 中设置和查询版本记录的最佳实践 [英] Best practice for setup and querying versioned records in T-SQL

查看:21
本文介绍了在 T-SQL 中设置和查询版本记录的最佳实践的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试优化我的 SQL 查询,我总是回到这个问题,我希望深入了解如何最好地优化它.

I'm trying to optimize my SQL queries and I always come back to this one issue and I was hoping to get some insight into how I could best optimize this.

为简洁起见,假设我有一个简单的员工表:

For brevity, lets say I have a simple employee table:

tbl_employees

Id     HiredDateTime
------------------
1      ...        
2      ...      

在另一个表中为每个员工提供版本信息:

That has versioned information in another another table for each employee:

tbl_emplyees_versioned

Id     Version   Name     HourlyWage
-------------------------------
1      1         Bob      10
1      2         Bob      20
1      3         Bob      30
2      1         Dan      10
2      2         Dan      20

这是在视图中检索最新版本记录的方式:

And this is how the latest version records are retrieved in a View:

Select tbl_employees.Id, employees_LatestVersion.Name, employees_LatestVersion.HourlyWage, employees_LatestVersion.Version
From tbl_employees
Inner Join tbl_employees_versioned
 ON tbl_employees.Id = tbl_employees_versioned.Id
CROSS APPLY 
   (SELECT Id, Max(Version) AS Version
    FROM tbl_employees_versioned AS employees_LatestVersion
    WHERE Id = tbl_employees_versioned.Id
    GROUP BY Id) AS employees_LatestVersion

要得到这样的回复:

Id     Version   Name     HourlyWage
-------------------------------
1      3         Bob      30
2      2         Dan      20

当提取包含超过 500 条员工记录且每个记录有几个版本的查询时,此查询开始卡住并需要几秒钟才能运行.

When pulling a query that has over 500 employees records for which each have a couple few versions, this query starts choking up and takes a few seconds to run.

马上就遇到了几次打击,但我不知道如何克服它们.

There are a couple strikes right off the bat, but I'm not sure how to overcome them.

  1. 显然交叉应用会增加一些性能损失.处理这样的版本信息时是否有最佳实践?有没有更好的方法来获得最高版本的记录?

  1. Obviously the Cross Apply adds some performance loss. Is there a best practice when dealing with versioned information like this? Is there a better way to get just a record with the highest version?

版本化表没有聚集索引,因为 Id 或 Version 都不是唯一的.将它们连接在一起,但它不会那样工作.相反,Id 有一个非聚集索引,Version 有另一个索引.有没有更好的方法来索引这个表以获得任何性能增益?索引视图真的有用吗?

The versioned table doesn't have a clustered index beause neither Id or Version are unique. Concatenated together they would be, but it doesn't work like that. Instead there is a non-clustered index for Id and another one for Version. Is there a better way to index this table to get any performance gain? Would an indexed view really help here?

推荐答案

我认为构建数据的最佳方式是使用开始日期和结束日期.因此,原始表的数据结构如下所示:

I think the best way to structure the data is using start dates and end dates. So, the data structure for your original table would look like:

create table tbl_EmployeesHistory (
    EmployeeHistoryId int,
    EffDate date not null,
    EndDate date,
    -- Fields that describe the employee during this time
)

然后,您可以使用视图查看当前版本:

Then, you can see the current version using a view:

create view vw_Employees as
    select *
    from tbl_EmployeesHistory
    where EndDate is NULL

在某些情况下,如果允许未来结束日期,则 where 子句将是:

In some cases, where future end dates are allowed, the where clause would be:

where coalesce(EndDate, getdate()) >= getdate()

或者,在这种情况下,您可以将 EndDate 设置为未来某个很远很远的日期,例如01-o1-9999".您可以将其添加为 create table 语句中的默认值,使该列不为 null,然后您就可以始终使用该语句:

Alternatively, in this case, you can default EndDate to some future date far, far away such as '01-o1-9999'. You would add this as the default in the create table statement, make the column not null, and then you can always use the statement:

where getdate() between EffDate and EndDate

正如 Martin 在他的评论中指出的那样,coalesce() 可能会妨碍索引的使用(在 SQL Server 中是这样),而这没有这个问题.

As Martin points out in his comment, the coalesce() might impede the use of an index (it does in SQL Server), whereas this does not have that problem.

这称为缓慢变化的维度.Ralph Kimball 在他关于数据仓库的书中详细讨论了这个概念.

This is called a slowly changing dimension. Ralph Kimball discusses this concept in some length in his books on data warehousing.

这篇关于在 T-SQL 中设置和查询版本记录的最佳实践的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆