在建模和规范化时如何处理空值? [英] What to do with null values when modeling and normalizing?

查看:138
本文介绍了在建模和规范化时如何处理空值?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是SQL(仍在学习中)的新手,我必须为场地创建一个数据库. 客户预订活动室.问题在于客户并不总是提供他们的姓名,电子邮件和电话号码.在大多数情况下,它是姓名和电子邮件或姓名和电话.这很少是全部3种,但它确实会发生. 我需要将它们分别存储在各自的属性(名称,电子邮件,电话)中.但是他们给我他们的信息的方式,我有很多空值. 这些空值怎么办?有人告诉我最好不要有空值.之后,我还需要对我的表进行规范化. 请提出任何建议.

I'm new in SQL(still learning) and I have to create a database for a venue. A client book for a room for an event. The problem is that the clients don't always provide their name, their email, and their phone number. Most of the time it's either name and email or name and phone. It's rarely all 3 but it happens. I need to store each of these in their respective attribute (name, email, phone). But the way they give me their info, I have a lot of null values. What can I do with these nulls? I've been told that it's better to not have nulls. I also need to normalize my table after that. Any suggestions please.

推荐答案

SQL根据其3VL(三值逻辑)版本特别对待NULL.归一化其他关系理论则没有.但是,我们可以将SQL设计转换为关系设计,然后再转换为关系设计. (假设此处没有重复的行.)

SQL treats NULL specially per its version of 3VL (3-valued logic). Normalization & other relational theory does not. However, we can translate SQL designs into relational designs and back. (Assume no duplicate rows here.)

规范化发生在关系上,它是根据不专门处理NULL的运算符定义的.术语"规范化"具有两个最常见的不同含义:将表放入"1NF"和更高的NF(正常形式)". NULL不会影响归一化为1NF". 归一化为较高的NF"将表替换为连接到该表的较小表.为了标准化,您可以将NULL视为可空列的域中允许使用的值,除了其SQL类型的值外.如果我们的SQL表没有NULL,则可以将它们解释为关系&. SQL联接等,如联接等.但是,如果分解组件之间共享可为空列的位置,则意识到要在SQL中重构原始列,必须对相等的或两个NULL .而且,您将不需要SQL数据库中的此类CK(候选关键字).例如,您不能将其声明为SQL PK(主键),因为这意味着UNIQUE NOT NULL.例如,涉及可空列的UNIQUE约束允许该行中具有NULL的多个行,即使每列中的行具有相同的值.例如,SQL FK中的NULL会使它们得到满足(对于每个MATCH模式以各种方式),而不会因未出现在引用表中而失败. (但是DBMS与标准SQL特有地不同.)

Normalization happens to relations and is defined in terms of operators that don't treat NULL specially. The term "normalization" has two most common distinct meanings: putting a table into "1NF" and into "higher NFs (normal forms)". NULL doesn't affect "normalization to 1NF". "Normalization to higher NFs" replaces a table by smaller tables that join back to it. For purposes of normalization you could treat NULL like a value that is allowed in the domain of a nullable column in addition to the values of its SQL type. If our SQL tables have no NULLs then we can interpret them as relations & SQL join etc as join, etc. But if you decompose where a nullable column was shared between components then realize that to reconstruct the original in SQL you have to SQL join on same-named columns being equal or both NULL. And you won't want such CKs (candidate keys) in an SQL database. Eg you can't declare it as an SQL PK (primary key) because that means UNIQUE NOT NULL. Eg a UNIQUE constraint involving a nullable column allows multiple rows that have a NULL in that column, even if the rows have the same values in every column. Eg NULLs in SQL FKs cause them to be satisfied (in various ways per MATCH mode), not to fail from not appearing in the referenced table. (But DBMSs idiosyncratically differ from standard SQL.)

不幸的是,分解可能导致带有所有 CK的表包含NULL,因此我们没有什么要声明为SQL PK或UNIQUE NOT NULL.唯一确定的解决方案是转换为无NULL设计.然后进行规范化之后,我们可能想在组件中重新引入一些可空性.

Unfortunately decomposition might lead to table with all CKs containing NULL, so that we have nothing to declare as SQL PK or UNIQUE NOT NULL. The only sure solution is to convert to a NULL-free design. After then normalizing we might want to reintroduce some nullability in the components.

在实践中,我们设法设计表,以便始终存在一组可以通过SQL PK或UNIQUE NOT NULL声明为CK的无空列.我们可以通过将其从表中删除并添加具有该列和某些无空CK的列的表来摆脱并非所有无空CK的可空列:旧设计中的行,然后将其CK子行和列值的行放入添加的表中;否则,在旧设计中为NULL,并且添加的表中没有对应的行.当然,我们还必须将查询从旧设计修改为新设计.

In practice, we manage to design tables so that there is always a set of NULL-free columns that we can declare as CK, via SQL PK or UNIQUE NOT NULL. We can get rid of a nullable column that isn't in all NULL-free CKs by dropping it from the table and adding a table with that column and the columns of some NULL-free CK: If the column is non-NULL for a row in the old design then a row with its CK subrow and column value go in the added table; otherwise it is NULL in the old design and no corresponding row is in the added table. Of course, we also have to modify queries from the old design to the new design.

我们总是可以通过添加标记列的设计来避免空值,该标记列向行说明以前的可空列在旧设计中是否为NULL,如果是这样,则使该列成为为此目的我们选择的某个值在整个数据库中输入.当然,我们还必须将查询从旧设计修改为新设计.

We can always avoid nulls via a design that adds a flag column that says for a row whether a previously nullable column is NULL in the old design and if so having that column be some one value that we pick for that purpose for that type throughout the database. Of course, we also have to modify queries from the old design to the new design.

是否要避免NULL是一个单独的问题.对于这两种设计的应用程序,您的数据库在某种程度上可能会更好"或更差".避免NULL的想法是它使查询的含义复杂化,因此与来自更多无NULL表的更多联接的复杂性. (通常通过删除查询表达式中的NULL使其尽可能接近它们来管理这种异常.)

Whether you want to avoid NULL is a separate question. Your database might in some way be "better" or "worse" for your application with either design. The idea behind avoiding NULL is that it complicates the meanings of queries, hence complicates querying, in a perverse way, compared to the complication of more joins from more NULL-free tables. (That perversity is typically managed by removing NULLs in query expressions as close to where they appear as possible.)

PS许多SQL术语,包括PK& FK与关系术语不同. SQL PK的含义更像是超键; SQL FK的含义更像是外键. 但在SQL中谈论超级键"甚至没有意义:

PS Many SQL terms including PK & FK differ from the relational terms. SQL PK means something more like superkey; SQL FK means something more like foreign superkey; but it doesn't even make sense to talk about a "superkey" in SQL:

由于SQL表与关系的相似性,涉及关系的术语被随意地应用于表.但是,尽管您可以借用术语并赋予它们SQL含义(值,表,超键,CK,PK,FK,join和predicate,NF,规范化等),但您不能仅用这些SQL含义来代替这些单词RM的定义,定理或算法会带来一些理智或真实的发现.此外,RM概念几乎从不的SQL表示实际上告诉您如何正确地将RM概念应用于SQL数据库.他们只是模仿RM演示文稿,而忽略了对术语使用SQL含义是否会使事情变得毫无意义或无效.

Because of the resemblance of SQL tables to relations, terms that involve relations get sloppily applied to tables. But although you can borrow terms and give them SQL meanings (value, table, superkey, CK, PK, FK, join, and, predicate, NF, normalize, etc), you can't just substitute those SQL meanings for those words in RM definitions, theorems or algorithms and get something sensible or true. Moreover SQL presentations of RM notions almost never actually tell you how to soundly apply RM notions to an SQL database. They just parrot RM presentations, oblivious to whether their use of SQL meanings for terms makes things nonsensical or invalid.

这篇关于在建模和规范化时如何处理空值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆