Options for eliminating NULLable columns from a DB model (in order to avoid SQL's three-valued logic)?

Question

Some time ago, I read the book SQL and Relational Theory by C. J. Date. The author is well known for criticising SQL's three-valued logic (3VL). 1)

The author makes some strong points about why 3VL should be avoided in SQL; however, he doesn't outline what a database model would look like if nullable columns weren't allowed. I've thought about this for a bit and have come up with the following solutions. If I missed other design options, I would like to hear about them!

1) Date's critique of SQL's 3VL has in turn been criticized too: see this paper by Claude Rubinson (includes the original critique by C. J. Date).

Example table:

As an example, take the following table where we have one nullable column (DateOfBirth):

#  +-------------------------------------------+
#  |                   People                  |
#  +------------+--------------+---------------+
#  |  PersonID  |  Name        |  DateOfBirth  |
#  +============+--------------+---------------+
#  |  1         |  Banana Man  |  NULL         |
#  +------------+--------------+---------------+

Option 1: Emulating NULL through a flag and a default value:

Instead of making the column nullable, any default value is specified (e.g. 1900-01-01). An additional BOOLEAN column will specify whether the value in DateOfBirth should simply be ignored or whether it actually contains data.

#  +------------------------------------------------------------------+
#  |                              People'                             |
#  +------------+--------------+----------------------+---------------+
#  |  PersonID  |  Name        |  IsDateOfBirthKnown  |  DateOfBirth  |
#  +============+--------------+----------------------+---------------+
#  |  1         |  Banana Man  |  FALSE               |  1900-01-01   |
#  +------------+--------------+----------------------+---------------+
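
To make the trade-off of Option 1 concrete, here is a minimal sketch using Python's built-in sqlite3 module. The CHECK constraints, the storage of dates as ISO text, and the second row are my own assumptions for illustration; they are not part of the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE People (
        PersonID           INTEGER PRIMARY KEY,
        Name               TEXT    NOT NULL,
        IsDateOfBirthKnown INTEGER NOT NULL CHECK (IsDateOfBirthKnown IN (0, 1)),
        DateOfBirth        TEXT    NOT NULL DEFAULT '1900-01-01',
        -- Assumed extra rule: an unknown date must carry the placeholder value.
        CHECK (IsDateOfBirthKnown = 1 OR DateOfBirth = '1900-01-01')
    )
""")
conn.execute(
    "INSERT INTO People (PersonID, Name, IsDateOfBirthKnown) VALUES (1, 'Banana Man', 0)"
)
# Illustrative second row with a known date of birth.
conn.execute("INSERT INTO People VALUES (2, 'Satsuma Girl', 1, '1991-05-20')")

# Every query that touches DateOfBirth has to filter on the flag explicitly.
born_before_1950 = conn.execute(
    "SELECT Name FROM People "
    "WHERE IsDateOfBirthKnown = 1 AND DateOfBirth < '1950-01-01'"
).fetchall()

# Forgetting the flag silently treats the placeholder as a real date.
naive = conn.execute(
    "SELECT Name FROM People WHERE DateOfBirth < '1950-01-01'"
).fetchall()
```

The second query shows the main risk of this option: nothing in the schema stops a query from reading the placeholder as if it were data.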

Option 2: Turning a nullable column into a separate table:

The nullable column is replaced by a new table (DatesOfBirth). If a record doesn't have data for that column, there won't be a record in the new table:

#  +---------------------------+ 1    0..1 +----------------------------+
#  |         People'           | <-------> |         DatesOfBirth       |
#  +------------+--------------+           +------------+---------------+
#  |  PersonID  |  Name        |           |  PersonID  |  DateOfBirth  |
#  +============+--------------+           +============+---------------+
#  |  1         |  Banana Man  |
#  +------------+--------------+

While this seems like the better solution, this would possibly result in many tables that need to be joined for a single query. Since OUTER JOINs won't be allowed (because they would introduce NULL into the result set), all the necessary data could possibly no longer be fetched with just a single query as before.
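
A rough sketch of what this could look like in practice, again with Python's sqlite3 module. The column types and the foreign key are assumptions on my part. It shows that, without an OUTER JOIN, people with and without a recorded date of birth end up in two separate queries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE People (
        PersonID INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL
    );
    CREATE TABLE DatesOfBirth (
        PersonID    INTEGER PRIMARY KEY REFERENCES People (PersonID),
        DateOfBirth TEXT NOT NULL
    );
    INSERT INTO People VALUES (1, 'Banana Man');
""")

# An inner join returns only people with a recorded date, so no NULL appears.
with_dob = conn.execute("""
    SELECT p.PersonID, p.Name, d.DateOfBirth
    FROM People AS p
    JOIN DatesOfBirth AS d ON d.PersonID = p.PersonID
""").fetchall()

# People without a recorded date need a second, separate query.
without_dob = conn.execute("""
    SELECT p.PersonID, p.Name
    FROM People AS p
    WHERE NOT EXISTS (
        SELECT 1 FROM DatesOfBirth AS d WHERE d.PersonID = p.PersonID
    )
""").fetchall()
```

With only Banana Man in the table, the first query comes back empty and the second returns his row.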

Question: Are there any other options for eliminating NULL (and if so, what are they)?

Recommended answer

I saw Date's colleague Hugh Darwen discuss this issue in an excellent presentation "How To Handle Missing Information Without Using NULL", which is available on the Third Manifesto website.

His solution is a variant on your second approach. It's sixth normal form, with tables to hold both Date of Birth and identifiers where it is unknown:

#  +-----------------------------+ 1    0..1 +----------------------------+
#  |         People'             | <-------> |         DatesOfBirth       |
#  +------------+----------------+           +------------+---------------+
#  |  PersonID  |  Name          |           |  PersonID  |  DateOfBirth  |
#  +============+----------------+           +============+---------------+
#  |  1         |  Banana Man    |           |  2         |  20-MAY-1991  |
#  |  2         |  Satsuma Girl  |           +------------+---------------+
#  +------------+----------------+
#                                  1    0..1 +------------+
#                                  <-------> | DobUnknown |
#                                            +------------+
#                                            |  PersonID  |
#                                            +============+
#                                            | 1          |
#                                            +------------+

Selecting from People then requires joining all three tables, including boilerplate to indicate the unknown Dates Of Birth.
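
As an illustration only (this is my own sketch in Python's sqlite3, not code from the presentation), the three-table select might look like the following. The `DobInfo` column name and the `'Unknown'` literal are assumptions standing in for the boilerplate:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (PersonID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE DatesOfBirth (
        PersonID    INTEGER PRIMARY KEY REFERENCES People (PersonID),
        DateOfBirth TEXT NOT NULL
    );
    CREATE TABLE DobUnknown (
        PersonID INTEGER PRIMARY KEY REFERENCES People (PersonID)
    );
    INSERT INTO People VALUES (1, 'Banana Man'), (2, 'Satsuma Girl');
    INSERT INTO DatesOfBirth VALUES (2, '1991-05-20');
    INSERT INTO DobUnknown VALUES (1);
""")

# Two inner joins glued together with UNION ALL; the second branch supplies
# a literal in place of the missing date, so the result never contains NULL.
rows = conn.execute("""
    SELECT p.PersonID, p.Name, d.DateOfBirth AS DobInfo
    FROM People AS p
    JOIN DatesOfBirth AS d ON d.PersonID = p.PersonID
    UNION ALL
    SELECT p.PersonID, p.Name, 'Unknown' AS DobInfo
    FROM People AS p
    JOIN DobUnknown AS u ON u.PersonID = p.PersonID
    ORDER BY 1
""").fetchall()
```

Note that the combined column has to be text here, since it carries either a date or the word 'Unknown'; that is part of the price of keeping NULL out of the result.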

Of course, this is somewhat theoretical. The state of SQL these days is still not sufficiently advanced to handle all this. Hugh's presentation covers these shortcomings. One thing he mentions is not entirely correct: some flavours of SQL do support multiple assignment - for instance Oracle's INSERT ALL syntax.
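
Where nothing like INSERT ALL is available, the closest everyday approximation is probably a transaction that writes to both tables. The sketch below (Python's sqlite3, with a hypothetical helper name and a cut-down schema) is not true multiple assignment, since constraints are still checked statement by statement, but it does keep the two inserts all-or-nothing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (PersonID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE DobUnknown (PersonID INTEGER PRIMARY KEY);
""")

def add_person_with_unknown_dob(person_id, name):
    """Insert into both tables in one transaction (hypothetical helper)."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("INSERT INTO People VALUES (?, ?)", (person_id, name))
        conn.execute("INSERT INTO DobUnknown VALUES (?)", (person_id,))

add_person_with_unknown_dob(1, "Banana Man")

# A stray DobUnknown row makes the second insert fail for PersonID 3,
# so the People insert made just before it is rolled back as well.
conn.execute("INSERT INTO DobUnknown VALUES (3)")
conn.commit()
try:
    add_person_with_unknown_dob(3, "Hypothetical Person")
except sqlite3.IntegrityError:
    pass

people = conn.execute(
    "SELECT PersonID, Name FROM People ORDER BY PersonID"
).fetchall()
```

After the failed call, People still contains only Banana Man, which is the behaviour the three-table design relies on.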
