Why are NULL values mapped to 0 in fact tables?


Question

In the measure fields of fact tables (in dimensionally modeled data warehouses), why are NULL values usually mapped to 0?

Recommended answer

Although you've already accepted another answer, I would say that using NULL is actually the better choice, for a couple of reasons.

The first reason is that aggregates return the 'correct' answer (i.e. the one that users tend to expect) when NULL is present, but give the 'wrong' answer when you use zero. This is because aggregate functions such as AVG() skip NULL values, whereas a zero is counted like any other value. Consider the results from AVG() in these two queries:

-- with zero: SUM = 6.0, AVG = 1.5 (the zero row counts toward the average)
select SUM(measure), AVG(measure)
from
(
select 1.0 as measure
union all
select 2.0
union all
select 3.0
union all
select 0.0
) dt;

-- with NULL: SUM = 6.0, AVG = 2.0 (aggregates skip the NULL row)
select SUM(measure), AVG(measure)
from
(
select 1.0 as measure
union all
select 2.0
union all
select 3.0
union all
select null
) dt;

If we assume that the measure here is "number of days to manufacture item" and NULL represents an item that is still being produced, then zero gives the wrong answer: an unfinished item would be treated as one that took zero days to make. The same reasoning applies to MIN() and MAX() too.
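To make the MIN()/MAX() point concrete, here is a minimal sketch that runs the same comparison against an in-memory SQLite database from Python. The table name `fact_production`, its columns, and the sample rows are hypothetical and chosen only for illustration; other database engines may differ in details such as result types.

```python
import sqlite3

# Hypothetical fact table: item 4 is still in production, so its measure is NULL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_production (item_id INTEGER, days_to_manufacture REAL)"
)
conn.executemany(
    "INSERT INTO fact_production VALUES (?, ?)",
    [(1, 1.0), (2, 2.0), (3, 3.0), (4, None)],
)

# Aggregates over the raw column skip the NULL row.
null_row = conn.execute(
    """
    SELECT MIN(days_to_manufacture),
           MAX(days_to_manufacture),
           AVG(days_to_manufacture),
           COUNT(days_to_manufacture),
           COUNT(*)
    FROM fact_production
    """
).fetchone()

# Mapping NULL to 0 (here via COALESCE) makes the unfinished item count as 0 days.
zero_row = conn.execute(
    """
    SELECT MIN(COALESCE(days_to_manufacture, 0)),
           MAX(COALESCE(days_to_manufacture, 0)),
           AVG(COALESCE(days_to_manufacture, 0)),
           COUNT(COALESCE(days_to_manufacture, 0)),
           COUNT(*)
    FROM fact_production
    """
).fetchone()

print("with NULL:", null_row)
print("with zero:", zero_row)
```

With NULL kept, the minimum is 1 day and the average is 2 days over the 3 finished items, while COUNT(*) still reports 4 rows. With NULL mapped to 0, the minimum drops to 0 and the average to 1.5, which misrepresents how long items take to manufacture.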

The second issue is that if zero is a default value, how do you distinguish between zero as a default and zero as a real value? For example, consider a measure of "shipping charges in EUR" where NULL means that the customer picked up the order in person, so no shipping charge applies, and zero means the order was shipped to the customer for free. You can't use zero to replace NULL without completely changing the meaning of the data. You can obviously argue that the distinction should be clear from other dimensions (e.g. shipping method), but that makes reports more complex and the data harder to understand.
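The shipping example can be sketched the same way. The snippet below is an illustration only: the table `fact_orders`, the column `shipping_charge_eur`, and the three sample orders are assumptions, not part of the original answer. It counts "free shipping" orders and "pickup" orders, once with NULL preserved and once with NULL mapped to 0.

```python
import sqlite3

# Hypothetical fact table with three orders:
#   order 1: shipped for 4.50 EUR
#   order 2: shipped for free (a real zero)
#   order 3: picked up by the customer (no shipping charge applies -> NULL)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_orders (order_id INTEGER, shipping_charge_eur REAL)"
)
conn.executemany(
    "INSERT INTO fact_orders VALUES (?, ?)",
    [(1, 4.50), (2, 0.0), (3, None)],
)

# The same report query, parameterized by the expression used as the measure.
query = """
    SELECT SUM(CASE WHEN {col} = 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END),
           AVG({col})
    FROM fact_orders
"""

# NULL preserved: free shipping and pickup remain distinguishable.
with_null = conn.execute(query.format(col="shipping_charge_eur")).fetchone()

# NULL mapped to 0: the pickup order is indistinguishable from free shipping.
with_zero = conn.execute(
    query.format(col="COALESCE(shipping_charge_eur, 0)")
).fetchone()

print("with NULL:", with_null)  # (free-shipping orders, pickup orders, average charge)
print("with zero:", with_zero)
```

With NULL preserved, the report shows one free-shipping order, one pickup order, and an average charge of 2.25 EUR across the two shipped orders. Once NULL is mapped to 0, it shows two "free" orders, no pickups, and an average of 1.5 EUR, so the pickup information can only be recovered from another dimension.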
