Why does PostgreSQL consider NULL boundaries in range types to be distinct from infinite boundaries?

Question

Just to preface, I'm not asking what the difference is between a NULL boundary and an infinite boundary - that's covered in this other question. Rather, I'm asking why PostgreSQL makes a distinction between NULL and infinite boundaries when (as far as I can tell) they function exactly the same.

I started using PostgreSQL's range types recently, and I'm a bit confused by what NULL values in range types are supposed to mean. The documentation says:

The lower bound of a range can be omitted, meaning that all values less than the upper bound are included in the range, e.g., (,3]. Likewise, if the upper bound of the range is omitted, then all values greater than the lower bound are included in the range. If both lower and upper bounds are omitted, all values of the element type are considered to be in the range.

This suggests to me that omitted boundaries in a range (which are equivalent to NULL boundaries specified in a range type's constructor) should be considered infinite. However, PostgreSQL makes a distinction between NULL boundaries and infinite boundaries. The documentation continues:

You can think of these missing values [in a range] as +/-infinity, but they are special range type values and are considered to be beyond any range element type's +/-infinity values.

This is puzzling. "Beyond infinity" doesn't make sense, as the entire point of infinite values is that nothing can be greater than +infinity or less than -infinity. That doesn't break "element in range"-type checks, but it does introduce an interesting case for primary keys that I think most people wouldn't expect. Or at least, I didn't expect it.

Suppose we create a basic table whose sole field is a daterange, which is also the PK:

CREATE TABLE public.range_test
(
    id daterange NOT NULL,
    PRIMARY KEY (id)
);

Then we can populate it with the following data with no problem:

INSERT INTO range_test VALUES (daterange('-infinity','2021-05-21','[]'));
INSERT INTO range_test VALUES (daterange(NULL,'2021-05-21','[]'));

Selecting all the data reveals we have these two tuples:

[-infinity,2021-05-22)
(,2021-05-22)

So the two tuples are distinct, or there would have been a primary key violation. But again, NULL boundaries and infinite boundaries work exactly the same when we're dealing with the actual elements that make up the range. For example, there is no date value X such that X <@ [-infinity,2021-05-22) returns a different result than X <@ (,2021-05-22). This makes sense because NULL values can't have a type of date, so they can't even be compared to the range (and PostgreSQL even converted the inclusive boundary on the lower NULL bound in daterange(NULL,'2021-05-21','[]') to an exclusive boundary, (,2021-05-22) to be doubly sure). But why are two ranges that are identical in every practical way considered distinct?
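The observation can be reproduced without a table. The following is a minimal sketch that uses only the two range values from the INSERT statements; the column aliases are illustrative and not part of the original question.

```sql
-- The two ranges are not equal, which is why the primary key accepts both rows.
SELECT daterange('-infinity', '2021-05-21', '[]')
     = daterange(NULL,        '2021-05-21', '[]') AS ranges_equal;        -- false

-- Yet an ordinary date is contained in both of them.
SELECT '2000-01-01'::date <@ daterange('-infinity', '2021-05-21', '[]') AS in_infinite_range,   -- true
       '2000-01-01'::date <@ daterange(NULL,        '2021-05-21', '[]') AS in_unbounded_range;  -- true
```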

When I was still in school, I remember overhearing some discussion about the difference between "unknown" and "doesn't exist". Two people who were smarter than me were talking about it in the context of why NULL values often cause issues, and how replacing the single NULL with separate "unknown" and "doesn't exist" values might solve those issues, but the discussion was over my head at the time. Thinking about this weird feature reminded me of that discussion. So is the distinction between "unknown" and "doesn't exist" the reason why PostgreSQL treats NULL and +/-infinity as distinct? If so, why are ranges the only types that allow for that distinction in PostgreSQL? And if not, why does PostgreSQL treat functionally equivalent values as distinct?

Recommended answer

Rather, I'm asking why PostgreSQL makes a distinction between NULL and infinite boundaries when (as far as I can tell) they function exactly the same.

But they do not. NULL is a syntax convenience when used as the bound of a range, while -infinity / infinity are actual values in the domain of the range: abstract values meaning lesser / greater than any other value, but values nonetheless (which can be included or excluded).
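The difference becomes visible once the infinity value itself is tested against the range. The following is a short sketch rather than part of the original answer; it assumes the default '[)' bounds of the two-argument daterange constructor, and the column aliases are made up for illustration.

```sql
-- 'infinity' is an ordinary date value, so a bound at 'infinity' can exclude it.
-- daterange(a, b) defaults to '[)' bounds, i.e. the upper bound is exclusive.
SELECT 'infinity'::date <@ daterange('2021-05-21', 'infinity') AS in_bounded,    -- false
       'infinity'::date <@ daterange('2021-05-21', NULL)       AS in_unbounded;  -- true

-- upper_inf() reports only a missing bound, not an 'infinity' value.
SELECT upper_inf(daterange('2021-05-21', 'infinity')) AS inf_value,  -- false
       upper_inf(daterange('2021-05-21', NULL))       AS no_bound;   -- true
```

So the two kinds of bound are not functionally identical: a range ending at an excluded infinity leaves out one value that the unbounded range contains.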

Also, NULL works for any range type, while most data types don't have special values like -infinity / infinity. Take integer and int4range for example.
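As a brief illustration of that point (a sketch, with example values chosen arbitrarily), a NULL bound is accepted by range types whose element type has no infinity at all:

```sql
-- NULL (or an omitted bound) means "unbounded" for every range type.
SELECT int4range(NULL, 10) AS no_lower_bound;   -- (,10)
SELECT numrange(1.5, NULL) AS no_upper_bound;   -- [1.5,)

-- integer has no infinity value, so the next statement raises an error
-- (left commented out so that the script keeps running):
-- SELECT 'infinity'::integer;
```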

For a better understanding, consider the thread in pgsql-general that a_horse provided:

This makes sense because NULL values can't have a type of date, so they can't even be compared to the range

Every data type can be NULL, even domains that are explicitly NOT NULL. See:

That includes date, of course (like Adrian commented):

test=> SELECT NULL::date, pg_typeof(NULL::date);
 date | pg_typeof 
------+-----------
      | date
(1 row)

But trying to discuss NULL as a value (when used as the bound of a range) is a misleading approach to begin with. It's not a value.

... (and PostgreSQL even converted the inclusive boundary on the lower NULL bound in daterange(NULL,'2021-05-21','[]') to an exclusive boundary, (,2021-05-22) to be doubly sure).

Again, NULL is not treated as a value in the domain of the range. It just serves as convenient syntax to say: "unbounded". No more than that.
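The range accessor functions give one way to see this. The following is a small sketch using the range from the question; the column aliases are illustrative only.

```sql
-- With a NULL lower bound there is no bound value to include or exclude,
-- so the requested '[' is dropped and the range is stored as unbounded below.
SELECT daterange(NULL, '2021-05-21', '[]')                    AS canonical_form,   -- (,2021-05-22)
       lower(daterange(NULL, '2021-05-21', '[]')) IS NULL     AS lower_is_null,    -- true
       lower_inc(daterange(NULL, '2021-05-21', '[]'))         AS lower_inclusive,  -- false
       lower_inf(daterange(NULL, '2021-05-21', '[]'))         AS lower_unbounded;  -- true
```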
