How to best manage historical lookup values in a database?

Problem description

Overview

An incident database will have a number of columns, each holding the ID of a record in a lookup table.

The problem I'm trying to solve

I need to come up with a robust solution for managing historical data where some fields hold lookup IDs. I've listed my proposed solutions as well as the alternatives. I would like to know whether other developers manage these scenarios in a similar way in their projects. Perhaps you have a better approach?

Database: Oracle 10g

Column: Department name

Scenario: The department name can change any number of times during the year. The business needs to report on data for all its departments, but wants to see each incident under the department name as it was at the time of the incident.

Proposed solution: When setting up an entry in the department name lookup table, set a start date and an end date. Using a view, create a calculated field based on the incident date so that the correct department name is returned for any given point in time.
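The proposed view can be sketched as follows. This is only an illustration under stated assumptions: it runs on SQLite through Python's `sqlite3` module rather than Oracle 10g, and the table, column, and view names (`dept_names`, `incidents`, `incident_report`) and the sample rows are hypothetical. It also assumes the end date is exclusive and that a NULL end date marks the current name.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a date-ranged department lookup."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- One row per name a department has carried; end_date NULL = current.
        create table dept_names (
            dept_id    integer not null,
            dept_name  text    not null,
            start_date text    not null,   -- ISO dates compare correctly as text
            end_date   text,               -- exclusive upper bound (assumption)
            primary key (dept_id, start_date)
        );
        create table incidents (
            incident_id   integer primary key,
            dept_id       integer not null,
            incident_date text    not null
        );
        -- The view resolves the name that was valid on the incident date.
        create view incident_report as
        select i.incident_id, i.incident_date, d.dept_name
        from   incidents i
        join   dept_names d
          on   d.dept_id = i.dept_id
         and   i.incident_date >= d.start_date
         and  (d.end_date is null or i.incident_date < d.end_date);

        insert into dept_names values (1, 'Support',          '2009-01-01', '2009-07-01');
        insert into dept_names values (1, 'Customer Service', '2009-07-01', null);
        insert into incidents values (100, 1, '2009-03-15');
        insert into incidents values (101, 1, '2009-09-02');
    """)
    return conn

def report(conn):
    """Return (incident_id, dept_name) pairs, ordered by incident."""
    return conn.execute(
        "select incident_id, dept_name from incident_report order by incident_id"
    ).fetchall()
```

Calling `report(build_demo_db())` lists each incident under the name its department had on the incident date. Note that this layout relies on the application keeping the date ranges free of gaps and overlaps.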

Pros: With a little defensive coding, selected users could manage their own static data through a GUI without any additional database changes. Changes can be made on the fly, e.g. changing a name completely. No DBA support is required.

Cons: Potentially an expensive operation, given the volume of lookups and calculations performed over a large dataset.

Alternative solution: Simply store the plain-text value of the department name. The drawbacks are that DBAs would be needed for ad hoc requests to change or update values, that such updates might target specific date ranges and miss some records in error, and that table space consumption would increase.

Column: Assigned_Technician_ID

Scenario: Each incident has one technician assigned, and the technician's ID is stored on the incident. A lookup table holds a 'current' list of all available technicians. As people leave the business, the list has to be refreshed and obsolete technicians removed, to keep the number of values in the dropdowns to a minimum. The business will still want to see which technicians were assigned on all of its incident data.

Solution: Instead of deleting an entry from the technician lookup table, mark the entry with a flag that denotes 'archived/deleted'. The GUI dropdowns would filter on this flag to hide unwanted entries.
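A minimal sketch of the flag approach, again using SQLite through Python's `sqlite3` module as a stand-in for Oracle. The table and column names (`employees`, `technicians`, `is_archived`) and the sample people are hypothetical; the point is only that the dropdown query filters on the flag while the reporting query does not.

```python
import sqlite3

def build_demo_db():
    """Create a technician lookup with a soft-delete flag."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table employees (
            uid       integer primary key,
            full_name text not null
        );
        -- The lookup holds only the employee UID plus the archive flag.
        create table technicians (
            technician_id integer primary key references employees(uid),
            is_archived   integer not null default 0
                          check (is_archived in (0, 1))
        );
        create table incidents (
            incident_id            integer primary key,
            assigned_technician_id integer not null
                                   references technicians(technician_id)
        );
        insert into employees values (1, 'Ann Lee'), (2, 'Raj Patel');
        insert into technicians (technician_id) values (1), (2);
        insert into incidents values (500, 1), (501, 2);
    """)
    return conn

def archive_technician(conn, technician_id):
    """Flag a technician as archived instead of deleting the row."""
    conn.execute(
        "update technicians set is_archived = 1 where technician_id = ?",
        (technician_id,),
    )

def dropdown_choices(conn):
    """Technicians offered in the GUI: archived rows are filtered out."""
    return conn.execute("""
        select e.uid, e.full_name
        from   technicians t
        join   employees e on e.uid = t.technician_id
        where  t.is_archived = 0
        order  by e.full_name
    """).fetchall()

def incident_report(conn):
    """Reporting ignores the flag, so archived technicians still resolve."""
    return conn.execute("""
        select i.incident_id, e.full_name
        from   incidents i
        join   employees e on e.uid = i.assigned_technician_id
        order  by i.incident_id
    """).fetchall()
```

After `archive_technician(conn, 1)`, the dropdown no longer offers that technician, but incident 500 still reports the original assignee.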

Pros: The lookup table would consist only of the technician's UID from the employee table. If business requirements change, it would therefore be easy to render any attribute of a technician in the main view, e.g. full name or employee number.

Cons: As in the previous example, the lookups could be an expensive operation on a large dataset. Additional work would also be required on the GUI side with regard to business logic and design, specifically how to manage dropdown lists when the original entry has been 'archived'.

Alternative solution: As in the previous example, just use the plain-text value. The drawbacks would be greater consumption of table space and less flexibility when business requirements change.

Recommended answer

There is a technique called versioning that has been around for many years but is largely unworkable for several reasons. However, there is a similar technique I call Version Normal Form, which I have found to be very useful. Here is an example using an Employees table.

First, the static table is created. This is the main entity table, and it contains static data about the entity. Static data is data that is not expected to change during the life of the entity, such as birth date.

create table Employees(
  ID        int  auto_generated primary key,
  FirstName varchar( 32 ),
  Hiredate  date not null,
  TermDate  date,            -- last date worked
  Birthdate date,
  ...              -- other static data
);

It is important to realize that there is exactly one entry for each employee, just as with any such table.

Next comes the associated version table. It has a one-to-many (1-m) relationship with the static table, as there can be several versions for one employee.

create table Employee_versions(
  ID         int   not null,
  EffDate    date  not null,
  IsWorking  char( 1 ) not null default true,
  LastName   varchar( 32 ),    -- because employees can change last name
  PayRate    currency not null,
  WorkDept   int   references Depts( ID ),
  ...,              -- other changeable data
  constraint PK_EmployeeV primary key( ID, EffDate )
);

Note that the version table has an effective date but no matching no-longer-effective field. This is because once a version takes effect, it stays in effect until it is replaced by the subsequent version. The combination of ID and EffDate must be unique, so there cannot be two versions of the same employee that are active at the same time. And because the end of one version is defined by the start of the next, there cannot be a gap between the time one version ends and the time the next begins.
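To make the implied end date concrete, here is a small sketch, assuming SQLite via Python's `sqlite3` module and a trimmed-down `Employee_versions` table with made-up rows. The `EndDate` column alias and the helper names are hypothetical; the end of each version is derived from the next version's `EffDate` rather than stored.

```python
import sqlite3

def build_demo_db():
    """Create a cut-down version table with two versions for employee 1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table Employee_versions (
            ID       integer not null,
            EffDate  text    not null,   -- ISO date stored as text
            LastName text,
            primary key (ID, EffDate)
        );
        insert into Employee_versions values (1, '2008-02-01', 'Smith');
        insert into Employee_versions values (1, '2009-06-01', 'Jones');
    """)
    return conn

def version_ranges(conn, emp_id):
    """Return (EffDate, implied end date, LastName) for each version.

    The end date is the next version's EffDate, or None for the latest one.
    """
    return conn.execute("""
        select v.EffDate,
               (select min(n.EffDate)
                from   Employee_versions n
                where  n.ID = v.ID
                  and  n.EffDate > v.EffDate) as EndDate,
               v.LastName
        from   Employee_versions v
        where  v.ID = ?
        order  by v.EffDate
    """, (emp_id,)).fetchall()

def try_duplicate_version(conn):
    """Attempt a second version with the same ID and EffDate.

    Returns True if the insert succeeded, False if the primary key rejected it.
    """
    try:
        conn.execute(
            "insert into Employee_versions values (1, '2009-06-01', 'Brown')"
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

Since every end date is computed from its successor, the ranges are contiguous by construction, and the primary key rejects a second version starting on the same date.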

Most queries will want the current version of the employee data. This is obtained by joining the static row for the employee with the version that is in effect now, using the following query:

select  ...
from    Employees e
join    Employee_versions v1
    on  v1.ID = e.ID
    and v1.EffDate =(
        select  Max( v2.EffDate )
        from    Employee_versions v2
        where   v2.ID = v1.ID
            and v2.EffDate <= NOW()
    )
where  e.ID = :EmpID;

This returns the one and only version that started most recently in the past. Using the inequality <= in the date check (v2.EffDate <= NOW()) allows for effective dates in the future. Suppose you know that a new employee will start on the first day of next month, or that a pay raise is scheduled for the 13th of next month: this data can be inserted ahead of time. Such "preloaded" entries will be ignored until their effective date arrives.
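The query can be tried out with the sketch below. It is an illustration only: it uses SQLite through Python's `sqlite3` module, trims the tables to a few columns, invents the sample rows, and replaces `NOW()` with a bound parameter `:today` (an assumption made so the result does not depend on when the code runs).

```python
import sqlite3

def build_demo_db():
    """Create the static and version tables with one preloaded future raise."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table Employees (
            ID        integer primary key,
            FirstName text,
            HireDate  text not null,
            TermDate  text
        );
        create table Employee_versions (
            ID        integer not null references Employees(ID),
            EffDate   text    not null,
            IsWorking integer not null default 1,
            LastName  text,
            PayRate   numeric not null,
            primary key (ID, EffDate)
        );
        insert into Employees values (1, 'Dana', '2008-02-01', null);
        insert into Employee_versions values (1, '2008-02-01', 1, 'Smith', 20);
        insert into Employee_versions values (1, '2009-06-01', 1, 'Jones', 22);
        -- A raise entered ahead of time.
        insert into Employee_versions values (1, '2010-01-13', 1, 'Jones', 25);
    """)
    return conn

CURRENT_VERSION_SQL = """
    select e.ID, e.FirstName, v1.LastName, v1.PayRate, v1.EffDate
    from   Employees e
    join   Employee_versions v1
      on   v1.ID = e.ID
     and   v1.EffDate = (
               select max(v2.EffDate)
               from   Employee_versions v2
               where  v2.ID = v1.ID
                 and  v2.EffDate <= :today
           )
    where  e.ID = :emp_id
"""

def current_version(conn, emp_id, today):
    """Return the version row in effect on `today`, or None if there is none."""
    return conn.execute(
        CURRENT_VERSION_SQL, {"emp_id": emp_id, "today": today}
    ).fetchone()
```

With `today` set to `'2009-12-01'`, the preloaded raise of `'2010-01-13'` is skipped and the `'2009-06-01'` version is returned; once `today` reaches `'2010-01-13'`, the raise becomes the current version without any further update.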

Don't let the subquery put you off. It searches on ID and EffDate, which together form the primary key and are therefore indexed, so the result is quite fast.

There is a lot of flexibility in this design. Without the final where clause, the query above returns the latest data for all employees, present and past. You could check the TermDate field to get just the present employees. In fact, since a good many places in your apps will only be interested in the current information of current employees, that query would make a good view (omit the final where clause). The apps do not even need to know that such versions exist.
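One way such a view might look, sketched with SQLite through Python's `sqlite3` module. The view name `Current_employees`, the trimmed columns, and the sample rows are hypothetical, and SQLite's `date('now')` stands in for `NOW()`. Filtering on `TermDate is null` is one reading of "check the TermDate field".

```python
import sqlite3

def build_demo_db():
    """Create the tables plus a view of current data for current employees."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table Employees (
            ID        integer primary key,
            FirstName text,
            HireDate  text not null,
            TermDate  text
        );
        create table Employee_versions (
            ID       integer not null references Employees(ID),
            EffDate  text    not null,
            LastName text,
            PayRate  numeric not null,
            primary key (ID, EffDate)
        );
        -- Apps query this view and never see the version mechanics.
        create view Current_employees as
        select e.ID, e.FirstName, v1.LastName, v1.PayRate
        from   Employees e
        join   Employee_versions v1
          on   v1.ID = e.ID
         and   v1.EffDate = (
                   select max(v2.EffDate)
                   from   Employee_versions v2
                   where  v2.ID = v1.ID
                     and  v2.EffDate <= date('now')
               )
        where  e.TermDate is null;

        insert into Employees values (1, 'Dana', '2001-03-01', null);
        insert into Employees values (2, 'Lee',  '2002-01-01', '2004-12-31');
        insert into Employee_versions values (1, '2001-03-01', 'Smith', 20);
        insert into Employee_versions values (1, '2005-06-01', 'Jones', 22);
        -- Far-future version, used here only to show that it is skipped.
        insert into Employee_versions values (1, '2999-01-01', 'Jones', 30);
        insert into Employee_versions values (2, '2002-01-01', 'Park',  18);
    """)
    return conn

def current_employees(conn):
    """Return the rows an application would see through the view."""
    return conn.execute(
        "select ID, FirstName, LastName, PayRate from Current_employees order by ID"
    ).fetchall()
```

The terminated employee and the far-future version are both absent from the result, so a consumer of the view sees what looks like an ordinary one-row-per-employee table.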

If you have a particular date and want to see the data that was in effect at that time, change v2.EffDate <= NOW() in the subquery to v2.EffDate <= :DateOfInterest.
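Applied to the department-name scenario from the question, the date of interest can simply be each incident's own date. The sketch below assumes SQLite via Python's `sqlite3` module; the tables `Depts`, `Dept_versions`, and `Incidents`, their columns, and the sample rows are hypothetical and not taken from the original answer.

```python
import sqlite3

def build_demo_db():
    """Create a versioned department table and two incidents."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table Depts (
            ID integer primary key
        );
        create table Dept_versions (
            ID      integer not null references Depts(ID),
            EffDate text    not null,
            Name    text    not null,
            primary key (ID, EffDate)
        );
        create table Incidents (
            IncidentID   integer primary key,
            DeptID       integer not null references Depts(ID),
            IncidentDate text    not null
        );
        insert into Depts values (1);
        insert into Dept_versions values (1, '2009-01-01', 'Support');
        insert into Dept_versions values (1, '2009-07-01', 'Customer Service');
        insert into Incidents values (100, 1, '2009-03-15');
        insert into Incidents values (101, 1, '2009-09-02');
    """)
    return conn

def incidents_with_dept_name(conn):
    """Return each incident with the department name in effect on its date."""
    return conn.execute("""
        select i.IncidentID, i.IncidentDate, v1.Name
        from   Incidents i
        join   Dept_versions v1
          on   v1.ID = i.DeptID
         and   v1.EffDate = (
                   select max(v2.EffDate)
                   from   Dept_versions v2
                   where  v2.ID = v1.ID
                     and  v2.EffDate <= i.IncidentDate  -- the date of interest
               )
        order  by i.IncidentID
    """).fetchall()
```

Each incident is reported under the name its department carried at the time, with no end dates stored anywhere.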

More details can be found in a slide presentation here and a not-quite-completed document here.

To show off a little of the extensibility of the design, notice that there is an IsWorking indicator in the version table as well as a termination date in the static table. When an employee leaves the company, the last date worked is written to the static table, and a copy of the latest version with IsWorking set to false is inserted into the version table.

It's fairly common for employees to leave a company for a while and then be hired again. With just the date in the static table, the entry can be reactivated simply by setting that date back to NULL. But then a "look back" query for any time when the person was not an employee would still return a result, with no indication that they had left the company. A version with IsWorking = false on leaving the company and another with IsWorking = true on returning allows that value to be checked at the time of interest, so employees can be ignored for the periods when they were not employed, even if they returned later.
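The look-back check can be sketched as follows, assuming SQLite via Python's `sqlite3` module, a version table trimmed to the relevant columns, and invented dates for one employee who left and was later rehired. The helper name `was_employed` is hypothetical, and IsWorking is stored as 1/0 here rather than true/false.

```python
import sqlite3

def build_demo_db():
    """Create a version history: hired, left, then rehired."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table Employee_versions (
            ID        integer not null,
            EffDate   text    not null,
            IsWorking integer not null default 1,   -- 1 = true, 0 = false
            primary key (ID, EffDate)
        );
        insert into Employee_versions values (1, '2005-01-01', 1);  -- hired
        insert into Employee_versions values (1, '2007-06-30', 0);  -- left
        insert into Employee_versions values (1, '2009-02-01', 1);  -- rehired
    """)
    return conn

def was_employed(conn, emp_id, as_of):
    """Return True if the version in effect on `as_of` has IsWorking set."""
    row = conn.execute("""
        select v1.IsWorking
        from   Employee_versions v1
        where  v1.ID = :emp_id
          and  v1.EffDate = (
                   select max(v2.EffDate)
                   from   Employee_versions v2
                   where  v2.ID = v1.ID
                     and  v2.EffDate <= :as_of
               )
    """, {"emp_id": emp_id, "as_of": as_of}).fetchone()
    # No version at all means the person had not yet been hired.
    return row is not None and row[0] == 1
```

A look back to a date between the departure and the rehire reports the person as not employed, even though the static row's TermDate would have been reset to NULL on their return.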
