数据规范化和编写查询 [英] Data normalization and writing queries

查看:112
本文介绍了数据规范化和编写查询的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是一个jr。开发人员(5个月的工作),我有一个关于数据规范化的问题。现在,据我理解,数据规范化的一般原则是创建一个RDBMS,其中数据冗余保持最小。在我的项目中,一个DB人创建了一个DB。我们有50多个表,DB中的表通常是非常分散的,即。一个表有两个或三个列,就是这样。现在,当涉及到编写sql查询时,它已经成为一个轻微的麻烦,因为每个查询涉及组合通过几个不同的表,并将它们连接在一起。我想知道这是否是数据规范化的副作用?或者这是指向别的东西吗?

I'm a jr. developer (5 months on the job), and I have a question about data normalization. Now, as I understand it, the general principle behind data normalization is to create a RDBMS where data redundancy is kept to a minimum. In my project, one of the DB people created a DB. We have 50+ tables, and the tables in the DB are usually very fragmented, ie. a table has two or three columns and that's it. Now, when it comes to writing sql queries, it has become something of a minor hassle since each query involves combing through several different tables and joining them together. I was wondering if this is a a side effect of data normalization? Or does this point to something else?

我知道,对我来说,最简单的事情就是根据我要写的查询来写表。这将创建一个有很多冗余数据的数据库,但我很好奇,如果有一个愉快的媒体?

I know that the easiest thing to do, for me, would be to write tables based off the queries I have to write. This will create a DB with a lot of redundant data, but I was curious if there is a happy medium?

正如一个postscript,我不想来就像我在想我的工作,但我真的很好奇,更多地了解这一点。我的工作环境不是最友好的,所以我不舒服与同事提出这个问题。

Just as a postscript, I don't want to come across like I'm whining about my work, but I'm genuinely curious to learn more about this. My work environment is not the most friendly so I don't feel comfortable posing this question with my colleagues. However, I would appreciate any thoughts, books, tutorials or opinions from more experienced people.

感谢。

推荐答案


数据规范化的一般原则是创建一个数据冗余保持最小的RDBMS。

general principle behind data normalization is to create a RDBMS where data redundancy is kept to a minimum.

只是部分真实。

标准化不是关于冗余。

Normalization is not about "redundancy".

这是关于更新异常。

1NF是不使用数组规则。打破1NF意味着一个行不是原子的,但集合和集合中的独立更新不会很好地工作。有锁定和缓慢。

1NF is the "don't use arrays" rules. Breaking 1NF means a row isn't atomic, but a collection and independent updates in the collection wouldn't work out well. There'd be locking and slowness.

2NF是一键规则。每行都有一个键,行中的所有内容都取决于键。在密钥的部分上没有依赖关系。有些人喜欢谈论候选键和自然键和外键;它们可能存在或者它们可能不存在。当所有属性取决于一个密钥时满足2NF。如果键是一个单列的代理键,这个正常的形式是非常满意的。

2NF is the "one key" rule. Each row has exactly one key and everything in the row depends on the key. There are no dependencies on part of the key. Some folks like to talk about candidate keys and natural keys and foreign keys; they may exist or they may not. 2NF is satisfied when all attributes depend on one key. If the key is a single-column surrogate key, this normal form is trivially satisfied.

如果违反了2NF,你有列取决于键的一部分,但不是整个密钥。如果您有一个带有(部件号,版本号)作为键的表,以及颜色和重量的属性,其中重量取决于整个键,但颜色仅取决于部件号。你有一个2NF的问题,你可以更新一些部分的颜色,但不是其他人,创建数据异常。

If 2NF is violated, you've got columns which depend on part of a key, but not the whole key. If you had a table with (Part Number, Revision Number) as a key, and attributes of color and weight, where weight depends on the whole key, but color only depends on the part number. You have a 2NF problem where you could update some part colors but not others, creating data anomalies.

3NF是唯一的关键规则。如果将派生数据放在一行中,并更改派生结果,则它与源列不匹配。如果更改源列而不更新派生值,您也有问题。是的,触发器是一个糟糕的黑客,允许3NF设计违规。这不是重点。要点只是定义3NF并显示它防止更新问题。

3NF is the "only the key" rule. If you put derived data in a row, and change the derived result, it doesn't match the source columns. If you change a source column without updating the derived value, you have a problem, too. Yes, triggers are a bad hackaround to allow 3NF design violations. That's not the point. The point is merely to define 3NF and show that it prevents an update problem.


每个查询涉及组合通过几个不同的表, 。我想知道这是否是数据规范化的副作用?

each query involves combing through several different tables and joining them together. I was wondering if this is a a side effect of data normalization?

这篇关于数据规范化和编写查询的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆