Is there any difference between integer and bit(n) data types for a bitmask?
Question
I am working with a table in a PostgreSQL database that has several boolean columns that determine some state (e.g. published, visible, etc.). I want to make a single status column that will store all these values, as well as possible new ones, in the form of a bitmask. Is there any difference between integer and bit(n) in this case?
This is going to be a rather big table, because it stores objects that users create via a web interface. So I think I will have to use (partial) indexes for this column.
Recommended answer
If you only have a few variables, I would consider keeping separate boolean columns.
- Indexing is easy, including indexes on expressions and partial indexes.
- Query conditions are easy to write, easy to read, and meaningful.
- A boolean column occupies 1 byte (no alignment padding). For only a few variables, this takes the least space.
- Unlike the other options, boolean columns allow NULL values for individual bits, should you need that. If you don't, you can always define the columns NOT NULL.
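As a rough illustration of the first two points, here is a minimal sketch of separate boolean columns with a partial index. The table name obj, the created_at column, and the index name are assumptions made up for this example; they do not come from the question.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE obj (
    obj_id     serial PRIMARY KEY,
    created_at timestamptz NOT NULL DEFAULT now(),
    published  boolean NOT NULL DEFAULT false,
    visible    boolean NOT NULL DEFAULT false
);

-- Partial index: only rows that are both published and visible are indexed.
CREATE INDEX obj_live_created_idx ON obj (created_at)
WHERE published AND visible;

-- A query that repeats the index predicate in its WHERE clause
-- is a candidate for using the partial index.
SELECT obj_id
FROM   obj
WHERE  published AND visible
ORDER  BY created_at DESC
LIMIT  20;
```

Whether the planner actually picks the index depends on table statistics, so checking with EXPLAIN on real data is advisable.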
If you have more than a handful of variables but no more than 32, an integer column may serve best. (Or a bigint for up to 64 variables.)
- Occupies 4 bytes on disk (may require alignment padding, depending on preceding columns).
- Very fast indexing for exact matches (= operator).
- Handling individual values may be slower / less convenient than with varbit or boolean.
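The trade-off in the last two points can be sketched as follows. The table obj_int and the bit layout (value 1 = published, value 2 = visible) are assumptions chosen for this example only.

```sql
-- Hypothetical layout: bit value 1 = published, bit value 2 = visible.
CREATE TABLE obj_int (
    obj_id serial PRIMARY KEY,
    status integer NOT NULL DEFAULT 0
);

-- Exact match on the whole mask can use a plain B-tree index.
CREATE INDEX obj_int_status_idx ON obj_int (status);

SELECT obj_id FROM obj_int WHERE status = 3;  -- published and visible, nothing else

-- Testing a single flag needs a bitwise expression, which the plain
-- index above does not support directly.
SELECT obj_id FROM obj_int WHERE (status & 2) = 2;  -- visible is set

-- A partial index whose predicate repeats the expression is one option.
CREATE INDEX obj_int_visible_idx ON obj_int (obj_id)
WHERE (status & 2) = 2;

-- Setting and clearing a single flag.
UPDATE obj_int SET status = status | 2  WHERE obj_id = 1;  -- set visible
UPDATE obj_int SET status = status & ~2 WHERE obj_id = 1;  -- clear visible
```

Compared with the boolean version, each condition now carries a magic number, which is the "less convenient" part of the trade-off.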
With even more variables, or if you want to manipulate the values a lot, or if you don't have huge tables, or if disk space and RAM are not an issue, or if you are not sure what to pick, I would consider bit(n) or bit varying(n) (short: varbit(n)).
- Occupies at least 5 bytes (or 8 for very long strings) plus 1 byte for each group of 8 bits (rounded up).
- You can use bit string functions and operators directly, and some standard SQL functions as well.
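A small sketch of what working with a bit string column might look like. The table obj_bit, the width of 8 bits, and the choice of the second bit from the right as the "visible" flag are all assumptions for illustration.

```sql
-- Hypothetical table with an 8-bit status; second bit from the right = visible.
CREATE TABLE obj_bit (
    obj_id serial PRIMARY KEY,
    status bit(8) NOT NULL DEFAULT B'00000000'
);

-- Bitwise operators work directly on bit strings of equal length.
UPDATE obj_bit SET status = status | B'00000010' WHERE obj_id = 1;  -- set the flag

-- Test the flag with a mask ...
SELECT obj_id FROM obj_bit
WHERE  (status & B'00000010') = B'00000010';

-- ... or with the standard SQL substring function
-- (positions are 1-based, counted from the left).
SELECT obj_id FROM obj_bit
WHERE  substring(status FROM 7 FOR 1) = B'1';
```

Both operands of & and | must have the same length, so masks are written out at the full width of the column.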
For just 3 bits of information, individual boolean columns get by with 3 bytes, an integer needs 4 bytes (plus possible alignment padding), and a bit string needs 6 bytes (5 + 1).
For 32 bits of information, an integer still needs 4 bytes (+ padding), a bit string occupies 9 bytes for the same (5 + 4), and boolean columns occupy 32 bytes.
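One way to check these numbers on your own installation is pg_column_size() on stored values. This is only a sketch: the temporary table size_check and its column names are made up for the example, and the function reports the size of each value without any alignment padding between columns.

```sql
-- Hypothetical scratch table holding one value of each candidate type.
CREATE TEMP TABLE size_check (
    flag   boolean,
    mask   integer,
    bits3  bit(3),
    bits32 bit(32)
);

INSERT INTO size_check
VALUES (true, 3, B'011', X'FFFFFFFF');  -- X'FFFFFFFF' is a 32-bit string literal

-- Size in bytes of each stored value.
SELECT pg_column_size(flag)   AS boolean_bytes,
       pg_column_size(mask)   AS integer_bytes,
       pg_column_size(bits3)  AS bit3_bytes,
       pg_column_size(bits32) AS bit32_bytes
FROM   size_check;
```

Measuring values read from a table rather than literals matters here, because a literal that was never stored may be reported with a larger header than the on-disk form.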
To optimize disk space further, you need to understand the storage mechanisms of PostgreSQL, especially data alignment. More in this related answer.
This answer on how to transform the types boolean, bit(n) and integer may be of help, too.
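For orientation, the built-in explicit casts between these types look roughly like this. This is a sketch of casts that PostgreSQL provides, not a reproduction of the linked answer; the column aliases are illustrative.

```sql
-- Explicit casts between boolean, integer and bit strings.
SELECT true::integer             AS bool_to_int,  -- boolean -> integer
       1::boolean                AS int_to_bool,  -- integer -> boolean
       5::bit(8)                 AS int_to_bit,   -- integer -> bit(n), keeps the rightmost n bits
       B'00000101'::integer      AS bit_to_int;   -- bit string -> integer
```

There is no direct cast between boolean and bit(n) in this sketch; going through integer is one way to chain the two.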