Creating a db table NULL best practices


Problem description

I'm not sure what the best practices are for dealing with NULL values when I have a single table where two fields are only sometimes populated, creating a lot of NULL values in the rows.

Should the two fields be moved to a separate table, creating two tables with no NULL values?

A join across these two tables would just return a result that equals my original table with the NULLs, so what's the point in that?

It seems pointless to separate them, but I have been reading a bit about avoiding NULLs altogether in the db.

Any thoughts welcome, thanks.

Solution

  1. Purely theoretically, a NULL is supposed to mean "unknown value". So - again, purely theoretically - you should design your tables when normalized so that you don't need to fill out NULL values to mean "not applicable for this row". However, this point has pretty much no relation to any practical consideration (design, performance, or query readability).

  2. Practically, there are some performance considerations. You should normalize away very sparse data in the following cases:

    • There is material benefit from shortening the table (IO-wise and/or space-wise). NULLs do take space, and the wider the rows, the worse the performance. This is especially true when the table has a LOT of rows and there are many such sparse columns. For a smaller table with only 2 such columns, the benefits realized might not be worth the trouble of having an extra join.

    • Your queries have the column in question in the WHERE clause. IIRC, querying on a heavily NULL-ed column is rather inefficient.

    • On the other hand, at a certain point, having extra joins in the query might hurt optimizer performance (at least it does so on Sybase once your joins have 10+ tables - from taking up CPU resources when the optimizer runs to actually confusing the optimizer into picking a VERY bad plan). The solution is either to avoid having too many tables due to normalization (as in, don't bother splitting your 2 columns into a separate table) or to force the query plan. The latter is obviously Bad Juju.
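
To make the trade-off concrete, here is a minimal sketch of both layouts using Python's built-in `sqlite3` module. The table and column names (`item_wide`, `item`, `item_extra`, `extra_a`, `extra_b`) are hypothetical, SQLite stands in only for illustration (storage and optimizer behaviour differ between engines, including Sybase), and it assumes the two sparse fields are always populated together. It shows that the split layout stores no NULLs, yet a LEFT JOIN reproduces the original wide rows, NULLs included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Option A: one wide table; the two sparse columns are NULL for most rows.
cur.execute("""
    CREATE TABLE item_wide (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        extra_a TEXT,           -- hypothetical sparse column
        extra_b TEXT            -- hypothetical sparse column
    )
""")
cur.executemany(
    "INSERT INTO item_wide VALUES (?, ?, ?, ?)",
    [(1, "alpha", None, None), (2, "beta", "x", "y"), (3, "gamma", None, None)],
)

# Option B: the sparse columns live in a side table that only has a row
# when the values exist, so neither table stores a NULL.
cur.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute("""
    CREATE TABLE item_extra (
        item_id INTEGER PRIMARY KEY REFERENCES item(id),
        extra_a TEXT NOT NULL,
        extra_b TEXT NOT NULL
    )
""")
cur.execute("INSERT INTO item SELECT id, name FROM item_wide")
cur.execute("""
    INSERT INTO item_extra
    SELECT id, extra_a, extra_b
    FROM item_wide
    WHERE extra_a IS NOT NULL AND extra_b IS NOT NULL
""")

wide_rows = cur.execute(
    "SELECT id, name, extra_a, extra_b FROM item_wide ORDER BY id"
).fetchall()

# The LEFT JOIN puts the NULLs back for items without a side-table row.
joined_rows = cur.execute("""
    SELECT i.id, i.name, e.extra_a, e.extra_b
    FROM item AS i
    LEFT JOIN item_extra AS e ON e.item_id = i.id
    ORDER BY i.id
""").fetchall()

extra_count = cur.execute("SELECT COUNT(*) FROM item_extra").fetchone()[0]
print(wide_rows == joined_rows, extra_count)
```

Whether the extra join is worth it depends on the row counts and the number of sparse columns, as the answer says; with only two such columns on a small table, the single-table version is usually the simpler choice.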
