Standard use of 'Z' instead of NULL to represent missing data?


Question

Outside of the argument of whether or not NULLs should ever be used: I am responsible for an existing database that uses NULL to mean "missing or never entered" data. It is different from empty string, which means "a user set this value, and they selected 'empty'."

Another contractor on the project is firmly on the "NULLs do not exist for me; I never use NULL and nobody else should, either" side of the argument. However, what confuses me is that since the contractor's team DOES acknowledge the difference between "missing/never entered" and "intentionally empty or indicated by the user as unknown," they use a single character 'Z' throughout their code and stored procedures to represent "missing/never entered" with the same meaning as NULL throughout the rest of the database.

Although our shared customer has asked for this to be changed, and I have supported this request, the team cites this as "standard practice" among DBAs far more advanced than I; they are reluctant to change to use NULLs based on my ignorant request alone. So, can anyone help me overcome my ignorance? Is there any standard, or small group of individuals, or even a single loud voice among SQL experts which advocates the use of 'Z' in place of NULL?

Update

I have a response from the contractor to add. Here's what he said when the customer asked for the special values to be removed to allow NULL in columns with no data:

Basically, I designed the database to avoid NULLs whenever possible. Here is the rationale:

  • A NULL in a string [VARCHAR] field is never necessary because an empty (zero-length) string furnishes exactly the same information.
  • A NULL in an integer field (e.g., an ID value) can be handled by using a value that would never occur in the data (e.g., -1 for an integer IDENTITY field).
  • A NULL in a date field can easily cause complications in date calculations. For example, in logic that computes date differences, such as the difference in days between a [RecoveryDate] and an [OnsetDate], the logic will blow up if one or both dates are NULL -- unless an explicit allowance is made for both dates being NULL. That's extra work and extra handling. If "default" or "placeholder" dates are used for [RecoveryDate] and [OnsetDate] (e.g., "1/1/1900"), mathematical calculations might show "unusual" values -- but date logic will not blow up.
  • NULL handling has traditionally been an area where developers make mistakes in stored procedures.

In my 15 years as a DBA, I've found it best to avoid NULLs wherever possible.

This seems to validate the mostly negative reaction to this question. Instead of applying an accepted 6NF approach to designing out NULLs, special values are used to "avoid NULLs wherever possible." I posted this question with an open mind, and I am glad I learned more about the "NULLs are useful / NULLs are evil" debate, but I am now quite comfortable labeling the 'special values' approach to be complete nonsense.

"...an empty (zero-length) string furnishes exactly the same information."

No, it doesn't; in the existing database we are modifying, NULL means "never entered" and empty string means "entered as empty".
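The distinction is easy to demonstrate. The following is a minimal sketch, assuming SQLite through Python's `sqlite3` module; the `patient` table and `middle_name` column are hypothetical names chosen for illustration, not part of the real schema. Note that some engines behave differently here (Oracle, notably, stores a zero-length VARCHAR2 as NULL), so this should be checked against the RDBMS actually in use.

```python
import sqlite3

# Hypothetical table: middle_name is NULL when never entered,
# and '' when the user explicitly entered "empty".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, middle_name TEXT)")
conn.executemany(
    "INSERT INTO patient (id, middle_name) VALUES (?, ?)",
    [(1, None), (2, ""), (3, "Lee")],
)

# IS NULL finds only the row where nothing was ever entered.
never_entered = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM patient WHERE middle_name IS NULL ORDER BY id"
    )
]

# Comparing with '' finds only the row the user deliberately left empty.
entered_empty = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM patient WHERE middle_name = '' ORDER BY id"
    )
]

print(never_entered, entered_empty)
```

The two queries return different rows, so the two states carry different information in this setup.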

"NULL handling has traditionally been an area where developers make mistakes in stored procedures."

Yes, but those mistakes have been made thousands of times by thousands of developers, and the lessons and caveats for avoiding those mistakes are known and documented. As has been mentioned here: whether you accept or reject NULLs, representation of missing values is a solved problem. There is no need to invent a new solution just because developers continue to make easy-to-overcome (and easy-to-identify) mistakes.
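The date-arithmetic claim in the contractor's rationale can be checked directly. This is a sketch under stated assumptions: SQLite via Python's `sqlite3`, ISO-8601 text dates, and a hypothetical `episode` table standing in for the real [OnsetDate] and [RecoveryDate] columns. Other engines use different date functions, but standard SQL propagates NULL through arithmetic rather than raising an error.

```python
import sqlite3


def average_recovery_days(rows):
    """Average days between onset and recovery for the given (onset, recovery) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE episode (onset_date TEXT, recovery_date TEXT)")
    conn.executemany("INSERT INTO episode VALUES (?, ?)", rows)
    (avg_days,) = conn.execute(
        "SELECT AVG(julianday(recovery_date) - julianday(onset_date)) FROM episode"
    ).fetchone()
    conn.close()
    return avg_days


# Second patient has not recovered yet: stored as NULL.
with_null = average_recovery_days(
    [("2024-03-01", "2024-03-10"), ("2024-03-05", None)]
)

# Same data, but "not recovered" stored as the 1900-01-01 placeholder.
with_placeholder = average_recovery_days(
    [("2024-03-01", "2024-03-10"), ("2024-03-05", "1900-01-01")]
)

print(with_null, with_placeholder)
```

With NULL, the difference for the unrecovered patient is NULL and AVG skips it, so the average reflects only real data. With the placeholder, nothing fails, but the average silently becomes a large negative number, which is arguably the worse outcome.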


As a footnote: I have been a DBE and developer for more than 20 years (which is certainly enough time for me to know the difference between a database engineer and a database administrator). Throughout my career I have always been in the "NULLs are useful" camp, though I was aware that several very smart people disagreed. I was extremely skeptical about the "special values" approach, but not well-versed enough in the academics of "How To Avoid NULL the Right Way" to make a firm stand. I always love learning new things, and I still have lots to learn after 20 years. Thanks to all who contributed to make this a useful discussion.

Answer

Sack your contractor.

Okay, seriously, this isn't standard practice. This can be seen simply because every RDBMS that I have ever worked with implements NULL, defines logic for NULL, takes account of NULL in foreign keys, behaves differently for NULL in COUNT, and so on.
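To make the COUNT and foreign key points concrete, here is a minimal sketch, assuming SQLite via Python's `sqlite3` (where foreign key enforcement has to be switched on with a PRAGMA). The `clinic` and `visit` tables are hypothetical, and -1 stands in for the kind of "never occurs in the data" sentinel ID described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE clinic (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE visit ("
    " id INTEGER PRIMARY KEY,"
    " clinic_id INTEGER REFERENCES clinic(id),"
    " grade TEXT)"
)
conn.execute("INSERT INTO clinic (id) VALUES (1)")
conn.executemany(
    "INSERT INTO visit (clinic_id, grade) VALUES (?, ?)",
    [
        (1, "A"),       # real data
        (None, None),   # missing clinic and grade, stored as NULL
        (1, "Z"),       # missing grade, stored as the 'Z' sentinel
    ],
)

# COUNT(*) counts rows; COUNT(grade) skips NULL but counts 'Z' as if it were data.
total_rows, graded_rows = conn.execute(
    "SELECT COUNT(*), COUNT(grade) FROM visit"
).fetchone()

# A NULL foreign key was accepted above; a sentinel ID needs a dummy parent row.
try:
    conn.execute("INSERT INTO visit (clinic_id, grade) VALUES (-1, 'B')")
    sentinel_fk_accepted = True
except sqlite3.IntegrityError:
    sentinel_fk_accepted = False

print(total_rows, graded_rows, sentinel_fk_accepted)
```

The engine already knows how to treat NULL in both cases. With sentinels, every aggregate needs an extra filter and every referenced table needs a fake row to satisfy the constraint.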

I would actually contend that using 'Z' or any other placeholder is worse. You still require code to check for 'Z'. But you also need to document that 'Z' doesn't mean 'Z', it means something else. And you have to ensure that such documentation is read. And then what happens if 'Z' ever becomes a valid piece of data? (Such as a field for an initial?)

At a basic level, even without debating the validity of NULL vs 'Z', I would insist that the contractor conform to the standard practices that exist within your company, not his. Instituting his standard practice in an environment with an alternative standard practice will cause confusion, maintenance overheads, misunderstanding, and in the end increased costs and mistakes.


EDIT

There are cases where using an alternative to NULL is valid in my opinion. But only where doing so reduces code, rather than creating special cases that must be accounted for.

I've used that for date-bound data, for example. If data is valid between a start-date and an end-date, code can be simplified by not having NULL values. Instead a NULL start-date could be replaced with '01 Jan 1900' and a NULL end-date could be replaced with '31 Dec 2079'.

This still can change behaviour from what may be expected, and so should be used with care:

  • WHERE end-date IS NULL no longer gives data that is still valid
  • You just created your own millennium bug
  • etc.
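A minimal sketch of that trade-off, assuming SQLite via Python's `sqlite3` and ISO-8601 text dates (which compare correctly as strings); the `price` table and its rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price (item TEXT, start_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO price VALUES (?, ?, ?)",
    [
        ("widget", "2020-01-01", "2021-12-31"),  # closed range
        ("widget", "2022-01-01", "2079-12-31"),  # open-ended, stored as a far-future date
    ],
)

# The simplification: a plain BETWEEN, with no "OR end_date IS NULL" branch.
current = conn.execute(
    "SELECT start_date FROM price WHERE ? BETWEEN start_date AND end_date",
    ("2024-06-01",),
).fetchall()

# Caveat 1: the familiar IS NULL test no longer finds the open-ended row.
open_ended_by_null = conn.execute(
    "SELECT COUNT(*) FROM price WHERE end_date IS NULL"
).fetchone()[0]

# Caveat 2: the "open-ended" row stops matching once the far-future date passes.
after_sentinel = conn.execute(
    "SELECT COUNT(*) FROM price WHERE ? BETWEEN start_date AND end_date",
    ("2080-01-01",),
).fetchone()[0]

print(current, open_ended_by_null, after_sentinel)
```

The lookup query gets shorter, but both caveats from the list show up: code that relies on IS NULL finds nothing, and the range quietly expires on the chosen end date.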

This is equivalent to reforming abstractions such that all properties can always have valid values. It is markedly different from implicitly encoding specific meaning into arbitrarily chosen values.

Still, sack the contractor.
