db设计批评/建议 [英] db design critic /suggestions

查看:208
本文介绍了db设计批评/建议的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我重新设计了药房数据库系统,需要输入来查看新设计是否是最佳的或需要调整。

I am redesigning a pharmacy db system and need inputs to see if the new design is optimal or requires tweaking.

以下是旧系统的快照。

可以看出,药房表格存储药房信息及其地址和联系信息。药店被分组在一起用于发票目的(药学团)或销售,广告其他目的(横幅组)。发票组可以具有不同的物理地址,不同的联系信息。

As can be seen, the pharmacies table stores pharmacy information, along with its address and contact information. Pharmacies are grouped together for invoicing purposes(pharmacygroup) or for sales, advertsing other purposes (banner group). The invoice group may have a different physical address, different contact information.

这是我的新设计。我已将地址从药房和pharmacygroup表分成一个自己的表,并为联系人创建了一个新表。他们可以是技术联系人,帐户联系人,所有者联系人等,因此联系人类型表。药房和药店组可以有单独的联系信息,我想到了一个联系人表,并有一个'linktype'和'linkid'列,以指示其药房联系人或药店组联系,但我不知道这是否正确的方法。这是一个好的设计,还是会在数据检索方面代价昂贵,因为连接数?另一件我注意到的是,在旧的设计中,他们没有创建任何外键约束,虽然药房表有groupid和bannergroupid引用为pharmacygroup和bannergroup,可能节省时间进行数据检索。这是一个好方法吗?

Here's my new design. I have split the address from both the pharmacy and pharmacygroup table into a table of its own and made a new table for contacts. Their could be technical contacts, account contacts, owner contacts etc, hence the contacttypes table. The pharmacy and the pharmacygroup can have separate contact info, I thought of making a single contact table and have a 'linktype' and 'linkid' column to indicate if its a pharmacy contact or pharmacy group contact, but I am not sure if this is a right approach. Is this a good design or will it be costly in terms of data retrieval because of the number of joins?? Another thing I noted that , is in the old design , they didn't create any foreign key constraints, although the pharmacy table had groupid and bannergroupid references for pharmacygroup and bannergroup, possibly to save time for data retrieval. Is this a good approach?

推荐答案

您的设计对我看起来不错。我总是喜欢在设计步骤上花费时间重组数据,在系统投入生产之后有几个额外的连接。你永远不知道管理/销售/金融人员将要求什么样的报告,正确的关系设计会给你更多的自由。

Your design looks good to me. I always prefer to have a couple of extra joins on the design step over spending time reorganizing data after system went into production. You never know in advance what kind of reports will be requested by management/sales/financial people, and proper relational design will give you more freedom.

此外,你不能怪只有几个额外的 JOIN 用于您的性能问题。您应该随时查看:

Also, you cannot blame only a couple of extra JOINs for your performance issues. You should always look at:


  • 数据卷(和物理数据布局),

  • 交易金额和密度,

  • I / O,CPU,内存使用情况,

  • 您的RDBMS配置,

  • SQL查询质量。

  • data volumes (and physical data layout),
  • transaction amount and density,
  • I/O, CPU, memory usage,
  • your RDBMS configuration,
  • SQL queries quality.

在我看来, JOIN

对于 RI约束(引用完整性),我看到一些项目没有任何主/外键运行,以提高性能。主要原因是:我们将所有检查嵌入到应用程序中,应用程序是系统中任何更改的唯一来源。另一方面,他们同意,不知道系统是否处于一致的状态(事实上,分析显示他们不是)。

As to the RI constraints (Referential Integrity), I've seen a couple of projects that had been running without any Primary/Foreign keys for increased performance. The main excuse was: we have all checks embedded into the Application and Application is the only source of any changes in the system. On the other hand, they agreed, that it is not known, whether systems were in a consistent state (in fact, analysis showed they were not).

我总是坚持在设计状态上创建所有可能的键/约束,因为总是有一些牛仔,谁会挖入你的数据库,并调整他们似乎更好的数据。不过,您可能希望临时禁用或甚至删除批量数据操作的一些约束/索引,这也是一个官方推荐

I always stick to creating all possible keys/constraints on the design state, as there always will be some "cowboys" around, who will dig into your database and "adjust" data they seem fits better. Still, you might want to temporarily disable or even drop some constraints/indexes for the bulk data manipulations, which is also an official recommendation.

如果不确定,请创建两个测试数据库,一个具有约束,另一个没有约束。加载一些数据并比较查询性能。我认为这将是类似的。

If uncertain, create 2 test databases, one with and another without constraints. Load some data and compare query performance. I think it will be similar.

这里我对你的草图的评论,决定都是你的。

And here my comments on your sketches, decisions are all yours.


  • 您可能需要创建一个公共的联系人表,方法与地址即向目标关系添加 contact_id owner_contact_id 等列,而不是引用来自联系人表;

  • 由于您在 contacttype 表中只有一列有一个共同的联系人),最好移动唯一的字段,并避免这个表;

  • 你似乎有混合奇异/复数名称为你的表,更好地坚持一个共同的模式在这里。我个人偏好单数;

  • pharmacygroup 中,您的PK命名为 id 而所有其余的PK都遵循 table id模式,如果你在这里使用通用模式,稍后编写脚本会更容易;

  • 地址表中,您的字段带有下划线,例如 street_name ,而您在其他地方避免 _ - 考虑使其通用;

  • 引用的命名方式不同。虽然它不是那么重要,我有几个系统,我必须依赖约束的名称,所以最好使用一些模式在这里。我使用以下一个:

  • You might want to create a common contacts table the same way you did for addresses, i.e. add contact_id, owner_contact_id, etc. columns to the target relations instead of referencing relations from contacts table;
  • As you have only one column in contacttype table (and in case you'll have a common contacts), it's better to move the only field away and avoid this table;
  • You seem to have mixture of singular/plural names for your tables, better to stick to a common pattern here. I personally prefer singular;
  • In pharmacygroup your PK is named id, while all the rest PKs follow tableid pattern, it will be easier to write scripts later if you'll use a common pattern here;
  • In addresses table you have fields with underscores, like street_name, while elsewhere you avoid _ — consider making it common;
  • References are named differently. Although it is not so highly important, I do have a couple of systems where I have to rely on the constraints' names, so it's better to use some pattern here. I use the following one:


  1. 前缀 p _ f _ c _ t _ u _ i _ 用于主键,外键,检查约束,触发器,唯一索引和其他索引;


  2. 列约束/索引/触发器的名称。

  1. prefix p_, f_, c_, t_, u_ or i_ for primary, foreign keys, check constraints, triggers, unique and other indexes;
  2. name of the table;
  3. name of the column constraint/index/trigger refers to.


为什么我喜欢用单数形式命名表?因为我总是使用 _id模式命名PK,并且IMHO pharmacy_id 看起来更好然后 pharmacies_id 。我使用这种方法,因为我有一堆通用脚本,在加载到主表之前执行数据一致性检查时依赖这种模式。

Why I prefer naming tables in singular form? Because I always name PK using table_id pattern, and IMHO pharmacy_id looks better then pharmacies_id. I use this approach as I have a bunch of general-purpose scripts which relies on this pattern when performing data consistency checks prior to loading it into the main tables.

EDIT:
更多联系人信息。
您可以在所有表​​格中使用 contact_id ,使其成为主要联系人,无论这在您的应用程序中意味着什么。如果你需要更多的联系人在某些关系,那么你可以使用不同的前缀,如 owner_contact_id sales_contact_id

More on contacts. You can use contact_id in all your tables, making it a primary contact, whatever this might mean in your application. Should you need more contacts to be there for some relations, then you can go with different prefixes, like owner_contact_id, sales_contact_id, etc.

如果您希望有大量的联系人存在于某些关系中,例如 pharmacygroup 那么您可以添加一个额外的表,如下所示:

In case you expect a huge number of contacts to be there for some relations, like pharmacygroup, then you will can add an extra table like this:

CREATE TABLE pharmacygroupcontact (
    contactid     int4,
    groupid       int4,
    contact_desc  text
);

它部分复制您的初始 groupcontacts 由两个FK和描述组成。
哪种方法更好我不能告诉,因为我不知道Application是如何设计的。

It partially copies your initial groupcontacts, but consists of two FKs and a description. Which approach is better I cannot tell as I'm not aware how Application is designed.

这篇关于db设计批评/建议的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆