在Postgres中将通用位字符串与零进行比较 [英] Generic bit string comparison against zero in Postgres

查看:90
本文介绍了在Postgres中将通用位字符串与零进行比较的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有没有一种方法可以在不将位字符串宽度硬编码为0的情况下进行非零位字符串测试?

Is there a way to do a non-zero bit string test without hard-coding the bit string width of 0?

例如,假设我有两个表,用户和功能,每个都带有掩码,我要对此进行测试:

For example, suppose I have two tables, Users and Features, each with masks, I want to test this:

SELECT u.name FROM Users u, Features f
  WHERE u.mask & f.mask;

匹配隐式非零结果。但是,SQL要求 WHERE 的显式布尔结果,而不是隐式强制转换,例如:

matching implicit non-zero results. However, SQL requires an explicit boolean result for WHERE as opposed to an implicit cast, such as this:

SELECT u.name FROM Users u, Features f
  WHERE (u.mask & f.mask) != 0::BIT(2048);

我不想硬编码 2048 (或其他任何原因)。

I don't want to hardcode 2048 (or whatever) in this query for a number of reasons.

测试 expr = 0 expr> 0 会导致类型错误。奇怪的是,我可以测试 expr = 0 :: BIT(1),但这给出了错误的答案,因为Postgres并不认为所有的全零位字符串都相等。 / p>

Testing expr = 0 or expr > 0 results in a type error. Oddly, I can test expr = 0::BIT(1), but that gives the wrong answer because Postgres does not consider all all-zero bit strings to be equal.

select 0::BIT(2) > 0::BIT(1);
 ?column? 
----------
 t
(1 row)

我可以通过以下操作创建一个计算出的零:

I can create a calculated zero by doing this:

SELECT u.name FROM Users u, Features f
  WHERE (u.mask & f.mask) != (u.mask & ~u.mask);

虽然有效,但感觉像是骇人听闻的骇客。

Which works but feels like an awful hack.

有什么建议或见解吗?

结果

我对下面提供的几种选择进行了基准测试。

I benchmarked several options provided below. Thanks for the suggestions, Erwin!

基于非常大的数据集和100,000个查询,我发现以下构造导致每秒的相关查询。希望Postgres小组的人看到了这一点,并提供了通用0以加快处理速度!不幸的是,大多数通用方法似乎都需要进行字符串转换,这是非常昂贵的。

Based on a very large data set and 100,000 queries, I found the following constructs resulted in the associated queries per second. Hopefully someone from the Postgres team sees this and provides a generic 0 to speed things up! Unfortunately most generic approaches seem to incur a string conversion which is quite expensive.

Constructs                              |  Queries / s
----------------------------------------+--------------
(u.mask & f.mask) <> 0::BIT(2048)       |  158
(u.mask & f.mask) <> (u.mask # u.mask)  |  135
(u.mask & f.mask) <> (u.mask & ~u.mask) |  125
position('1' IN (u.mask & f.mask)) > 0  |   37
(u.mask & f.mask)::TEXT !~ '^0+$'       |   27


推荐答案

短位字符串



要排除按位AND(& 返回由零组成的位串,但长度可能会改变( B'000 ...'),则可以使用强制转换为整数(最多 bit(32))或 bigint (最多 bit(64)):

Short bitstring

To exclude cases where the bitwise AND (&) returns a bitstring consisting of nothing but zeros, but the length might change (B'000...'), you can use a cast to integer (up to bit(32)) or bigint (up to bit(64)):

SELECT u.name
FROM   users u
JOIN   features f ON (u.mask & f.mask)::int <> 0;

当转换为整数时,它们全部导致 0

这也排除其中任一列为 NULL 的情况。换句话说,结果必须包含至少一个 1

When cast to integer, all of them result in 0.
This also excludes cases where either of the columns is NULL. In other words, the result has to include at least one 1.

如果您的值可以大于64位,则可以强制转换为 text 并使用正则表达式进行检查:

If your values can be longer than 64 bit, you could cast to text and check with a regular expression:

ON (u.mask & f.mask)::text !~ '^0+$'

模式解释:

^ ..字符串的开头

0 + ..一个或多个'0'

$ ..字符串结尾

^ .. beginning of string
0+ .. one or more '0'
$ .. end of string

或者,作为手册会通知


以下SQL标准函数有效位字符串以及
字符串: length bit_length octet_length 位置子字符串重叠

Ergo:

ON position('1' IN (u.mask & f.mask)) > 0

这篇关于在Postgres中将通用位字符串与零进行比较的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆