Database design one-to-many-to-many


Problem description

I'm working on a database design for a Java web application work flow app to track and check various business products (documents). But I have a number of uncertainties with its design and implementation.
Here are my tables:

  • WORK_FLOW_CLASS - definition table of work flows for types of products, e.g. Reports, Presentations, Engineering Calculations

  • ROLE_CLASS - definition table of work flow roles, e.g. Originator, Checker, Approver, Manager

  • WORK_ITEM_CLASS - definition table of work items that represent some paper form, e.g. Assignment Form, Calculation Checklist, Report Review Form, Closeout Checklist, etc. These forms must be worked by multiple roles in a certain order before being considered complete

  • WORK_ACTION_CLASS - definition table of work item actions, e.g. complete Calculation Checklist (a WORK_ITEM) by Checker (a ROLE), complete Review Report (a WORK_ITEM) by Manager (a ROLE), complete Review Report (a WORK_ITEM) by Approver (a ROLE)

  • WORK_FLOW_SEQUENCE - definition table relating a WORK_FLOW_CLASS to its many WORK_ITEM_CLASSes, e.g. the Department ABC Level 1 Calculation sequence is: (1) Assignment Form, (2) Calculation Checklist, (3) Closeout Checklist. The Department XYZ Level 1 Calculation sequence is: (1) Assignment Form, (2) Closeout Checklist [XYZ chose not to do the checklist in their Calc work flow].

First question: Should the WORK_ACTION_CLASS have a FK to relate it to the WORK_ITEM_CLASS?

Or should I use an Intermediate table to provide the relationship? I think the latter would be unnecessary because I wouldn't need to relate a WORK_ACTION to multiple WORK_ITEMS, only one.

When a user starts a new Work Flow Instance my plan is to query the CLASS tables for the details of the selected WORK_FLOW_CLASS and instantiate them into the following tables.

  • WORK_FLOW_INSTANCE - Actual Instance of a work flow - I guess like a Shopping Cart Order

  • WFI_WORK_ITEMS - Work Flow Instance line items of WORK_ITEMs - I guess like the products in an Order

  • WFI_WORK_ITEM_ACTIONS - Work Flow Instance Work Item Actions

Here is where I need help with the second question.

Should there be two separate tables, WFI_WORK_ITEMS and WFI_WORK_ITEM_ACTIONS, or should I have one compound table? I will need to query just the WORK_ITEMS in the work flow instance as well as the WORK_ITEMS with their sub-step WORK_ACTIONS.

Here are the details for the two tables in question. Sorry, I do not know the best way to depict my schema.

WFI_WORK_ITEMS
==============
WFI_WORK_ITEM_ID (PK) 
WORKFLOW_INSTANCE_ID (FK) 
WORK_ITEM_CLASS_ID (FK)
STEP_NUM
LAST_DATE
STATUS
IS_ACTIVE


WFI_WORK_ITEM_ACTION
====================
WFI_WORK_ITEM_ACTION_ID (PK) 
WORKFLOW_INSTANCE_ID (FK)
WFI_WORK_ITEM_ID (FK) 
WORK_ITEM_CLASS_ID (FK)
STEP_NUM
WORK_ACTION_CLASS_ID (FK)
ACTION_OWNER
LAST_DATE DATE
STATUS
IS_ACTIVE

I seem to have a lot of redundant info in these two tables. But I have read that tables are sometimes de-normalized.

Any help with my design is greatly appreciated.

EDIT: What I meant by redundant data is that both tables list the WORKFLOW_INSTANCE_ID and WORK_ITEM_CLASS_ID. My thought was that if I wanted to know the WORK_ITEM_ACTIONS for a particular instance of a WORKFLOW, I could get them by querying the table WFI_WORK_ITEM_ACTION without joining the WFI_WORK_ITEM table. Maybe this is wrong thinking. The same goes for WORK_ITEM_CLASS_ID: by querying the WFI_WORK_ITEM_ACTION table, I'd know the type of WORK_ITEM the ACTION was for.

Is this a better design?

WFI_WORK_ITEM
==============
WFI_WORK_ITEM_ID (PK) 
WORKFLOW_INSTANCE_ID (FK) 
WORK_ITEM_CLASS_ID (FK)
WORK_ITEM_STEP_NUM
WORK_ITEM_LAST_DATE
WORK_ITEM_STATUS
WORK_ITEM_IS_ACTIVE


WFI_WORK_ITEM_ACTION
====================
WFI_WORK_ITEM_ACTION_ID (PK) 
WORKFLOW_INSTANCE_ID (FK)
WFI_WORK_ITEM_ID (FK) 
WORK_ITEM_CLASS_ID (FK)
WORK_ACTION_STEP_NUM
WORK_ACTION_CLASS_ID (FK)
ACTION_OWNER
WORK_ACTION_LAST_DATE
WORK_ACTION_STATUS
WORK_ACTION_IS_ACTIVE

Solution

You should always start by designing your tables in third normal form (3NF). It's quite acceptable to revert to lesser forms (for performance reasons usually) provided you understand and mitigate the impact, but start with 3NF.

The (slightly simplified) rule to remember is that every non-key column in a table should depend on:

  • the key,
  • the whole key,
  • and nothing but the key,
  • "so help me, Codd" - a little DBA humour (and I do mean "little").

The first question is fairly simple.

One-to-many relationships are best represented as a foreign key in the "many" table. So what you propose is sensible. It allows you to restrict the relationship automatically. If you had a separate joining table (used for many-to-many), you would need to resort to "trickery" to enforce the one-to-many relationship.
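To make the foreign-key approach concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names (name, role_class_id and so on) are assumptions made for illustration, since the question only names the tables; the point is just that the "many" table (work_action_class) carries the reference to its single parent.

```python
import sqlite3


def build_class_schema() -> sqlite3.Connection:
    """Create the definition tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE work_item_class (
            work_item_class_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE role_class (
            role_class_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- The "many" side holds the foreign key, so each action class
        -- belongs to exactly one work item class.
        CREATE TABLE work_action_class (
            work_action_class_id INTEGER PRIMARY KEY,
            work_item_class_id INTEGER NOT NULL
                REFERENCES work_item_class (work_item_class_id),
            role_class_id INTEGER NOT NULL
                REFERENCES role_class (role_class_id)
        );
    """)
    return conn


def insert_orphan_action(conn: sqlite3.Connection) -> bool:
    """Try to add an action for a work item class that does not exist."""
    try:
        conn.execute("INSERT INTO work_action_class VALUES (3, 99, 1)")
    except sqlite3.IntegrityError:
        return False  # rejected by the foreign key
    return True


conn = build_class_schema()
conn.execute("INSERT INTO work_item_class VALUES (1, 'Report Review Form')")
conn.execute("INSERT INTO role_class VALUES (1, 'Manager'), (2, 'Approver')")
# Two different actions on the same work item class: one-to-many.
conn.execute("INSERT INTO work_action_class VALUES (1, 1, 1), (2, 1, 2)")
```

With this layout an action row simply cannot point at more than one work item class, which is the restriction a separate joining table would need extra constraints to reproduce.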

As to your second question, you need to look to the "Codd" rule above and think to yourself: what exactly are these rows in each table representing? If a work item action is a distinct object from a work item (they may be related but, if they're not representing the same object, they're distinct), they should be in different tables.

In addition, it appears you have a one-to-many relationship there (one item can have many actions) so they should be in different tables for that reason alone.
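As a rough sketch of how two separate tables still cover both queries mentioned in the question (items only, and items with their actions), the following uses Python's sqlite3 module. The reduced column set and the item_step_num / action_step_num names are assumptions for illustration, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE wfi_work_item (
        wfi_work_item_id INTEGER PRIMARY KEY,
        workflow_instance_id INTEGER NOT NULL,
        work_item_class_id INTEGER NOT NULL,
        item_step_num INTEGER NOT NULL
    );
    -- One work item can have many actions.
    CREATE TABLE wfi_work_item_action (
        wfi_work_item_action_id INTEGER PRIMARY KEY,
        wfi_work_item_id INTEGER NOT NULL
            REFERENCES wfi_work_item (wfi_work_item_id),
        work_action_class_id INTEGER NOT NULL,
        action_owner TEXT,
        action_step_num INTEGER NOT NULL
    );
    INSERT INTO wfi_work_item VALUES (10, 1, 100, 1), (11, 1, 101, 2);
    INSERT INTO wfi_work_item_action VALUES
        (1000, 10, 500, 'alice', 1),
        (1001, 11, 501, 'bob', 1),
        (1002, 11, 502, 'carol', 2);
""")


def items_for_instance(conn, instance_id):
    """Work items only, in step order."""
    return conn.execute(
        "SELECT wfi_work_item_id, item_step_num FROM wfi_work_item "
        "WHERE workflow_instance_id = ? ORDER BY item_step_num",
        (instance_id,),
    ).fetchall()


def items_with_actions(conn, instance_id):
    """Work items together with their sub-step actions."""
    return conn.execute(
        "SELECT i.wfi_work_item_id, a.wfi_work_item_action_id, a.action_owner "
        "FROM wfi_work_item AS i "
        "JOIN wfi_work_item_action AS a "
        "  ON a.wfi_work_item_id = i.wfi_work_item_id "
        "WHERE i.workflow_instance_id = ? "
        "ORDER BY i.item_step_num, a.action_step_num",
        (instance_id,),
    ).fetchall()
```

The first query never touches the action table, and the second is a single join on the foreign key, so neither need seems to call for a compound table.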

As to your query on the redundant info: if it really is redundant, it should be fixed.

Using the step_num as an example, what exactly does this represent? If it's an attribute of the work item, it shouldn't be in the work action table at all.

You would get rid of it from there and, if you wanted to know the step number for a row in the work action table, you would join with the work item table using the foreign key.
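The join described above can be hidden behind a view, which gives the convenience the question's EDIT was after without storing anything twice. This is only a sketch in Python's sqlite3, assuming step_num is an attribute of the work item; the view name wfi_action_detail is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wfi_work_item (
        wfi_work_item_id INTEGER PRIMARY KEY,
        workflow_instance_id INTEGER NOT NULL,
        work_item_class_id INTEGER NOT NULL,
        step_num INTEGER NOT NULL
    );
    -- No workflow_instance_id, work_item_class_id or step_num here:
    -- they are reachable through wfi_work_item_id.
    CREATE TABLE wfi_work_item_action (
        wfi_work_item_action_id INTEGER PRIMARY KEY,
        wfi_work_item_id INTEGER NOT NULL
            REFERENCES wfi_work_item (wfi_work_item_id),
        work_action_class_id INTEGER NOT NULL,
        action_owner TEXT
    );
    -- Hypothetical view that looks like the wide table from the question.
    CREATE VIEW wfi_action_detail AS
    SELECT a.wfi_work_item_action_id,
           a.action_owner,
           i.workflow_instance_id,
           i.work_item_class_id,
           i.step_num
    FROM wfi_work_item_action AS a
    JOIN wfi_work_item AS i ON i.wfi_work_item_id = a.wfi_work_item_id;

    INSERT INTO wfi_work_item VALUES (10, 1, 100, 1), (11, 1, 101, 2);
    INSERT INTO wfi_work_item_action VALUES
        (1000, 10, 500, 'alice'),
        (1001, 11, 501, 'bob'),
        (1002, 11, 502, 'carol');
""")


def actions_for_instance(conn, instance_id):
    """Actions of one workflow instance with their item's class and step."""
    return conn.execute(
        "SELECT wfi_work_item_action_id, work_item_class_id, step_num "
        "FROM wfi_action_detail WHERE workflow_instance_id = ? "
        "ORDER BY wfi_work_item_action_id",
        (instance_id,),
    ).fetchall()
```

Because the step number lives only in wfi_work_item, changing it there is immediately reflected for every action of that item; there is no second copy to fall out of step.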

If instead it's an attribute of the work action, you should remove it from the work item table since it makes no sense. You might have two actions each with a different step number so what would the step number of the parent item be in that case?

Of course, you may have a distinct step number for both items and actions - in that case, I would consider renaming to make the intent clear, something like item_step_num and action_step_num.

The bottom line is to start with 3NF. If at some point your database runs too slowly, then consider reverting to a lesser form. You can then ask another question here about how to recognise and mitigate the problems that arise from that (for example, the possibility of inconsistent data in two places, and using triggers to prevent that).
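For a feel of what such a trigger might look like, here is a small sketch in Python's sqlite3, assuming you had deliberately kept a redundant workflow_instance_id in the action table. The trigger name and the trimmed-down columns are made up for the example, and the trigger syntax is SQLite's; other databases spell it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE wfi_work_item (
        wfi_work_item_id INTEGER PRIMARY KEY,
        workflow_instance_id INTEGER NOT NULL
    );
    CREATE TABLE wfi_work_item_action (
        wfi_work_item_action_id INTEGER PRIMARY KEY,
        workflow_instance_id INTEGER NOT NULL,  -- deliberate redundant copy
        wfi_work_item_id INTEGER NOT NULL
            REFERENCES wfi_work_item (wfi_work_item_id)
    );
    -- Reject an action whose copied instance id disagrees with its parent.
    CREATE TRIGGER wfi_action_instance_check
    BEFORE INSERT ON wfi_work_item_action
    FOR EACH ROW
    WHEN NEW.workflow_instance_id <> (
        SELECT workflow_instance_id FROM wfi_work_item
        WHERE wfi_work_item_id = NEW.wfi_work_item_id
    )
    BEGIN
        SELECT RAISE(ABORT, 'workflow_instance_id does not match parent work item');
    END;

    INSERT INTO wfi_work_item VALUES (10, 1);
""")


def try_insert_action(conn, action_id, instance_id, item_id):
    """Return True if the row was accepted, False if the database refused it."""
    try:
        conn.execute(
            "INSERT INTO wfi_work_item_action VALUES (?, ?, ?)",
            (action_id, instance_id, item_id),
        )
    except sqlite3.DatabaseError:
        return False
    return True
```

This only guards inserts. A fuller version would presumably need matching triggers for updates on both tables, which is part of the cost of denormalizing.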
