Any disadvantages to bit flags in database columns?


Problem description

Consider the following tables:

CREATE TABLE user_roles(
    pkey         SERIAL PRIMARY KEY,
    bit_id       BIGINT NOT NULL,
    name         VARCHAR(256) NOT NULL
);

INSERT INTO user_roles (bit_id,name) VALUES (1,'public');
INSERT INTO user_roles (bit_id,name) VALUES (2,'restricted');
INSERT INTO user_roles (bit_id,name) VALUES (4,'confidential');
INSERT INTO user_roles (bit_id,name) VALUES (8,'secret');

CREATE TABLE news(
    pkey          SERIAL PRIMARY KEY,
    title         VARCHAR(256),
    company_fk    INTEGER REFERENCES compaines(pkey), -- updated since asking the question
    body          VARCHAR(512),
    read_roles    BIGINT -- bit flag 
);

read_roles is a bit-flag column that specifies which combination of roles can read a news item. So if I insert a news item that can be read by restricted and confidential, I would set read_roles to 2 | 4, i.e. 6. When I want to get back the news posts that a particular user can see, I can use a query like:

select * from news WHERE company_fk=2 AND ((read_roles & 2) != 0 OR (read_roles & 4) != 0);
select * from news WHERE company_fk=2 AND read_roles = 6; 
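
Testing whether a flag is set needs bitwise AND (&). Bitwise OR (|) with a non-zero mask never yields 0, so a test written with it matches every row. A minimal sketch that can be run in Postgres to see the difference; the literal values are only illustrative:

```sql
-- A row readable only by 'public' (bit 1), tested against 'restricted' (bit 2).
SELECT 1 | 2 AS or_result,   -- 3: OR with a non-zero mask is never 0
       1 & 2 AS and_result;  -- 0: AND is 0 when the flag is not set

-- A row readable by 'restricted' and 'confidential' (2 | 4 = 6).
SELECT (6 & 2) <> 0 AS restricted_can_read,  -- true
       (6 & 8) <> 0 AS secret_can_read;      -- false
```

The extra parentheses around the AND expression are not strictly required in Postgres, but they make the intended grouping explicit.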

What are the disadvantages of using bit flags in database columns in general? I assume the answer might be database-specific, so I am interested in learning about disadvantages with specific databases.

I am using Postgres 9.1 for my application.

UPDATE I got the bit about the database not being able to use an index for bit operations, which would require a full table scan and hurt performance. So I have updated the question to reflect my situation more closely: each row in the database belongs to a specific company, so all queries will have a WHERE clause that includes company_fk, which will have an index on it.

UPDATE I only have 6 roles right now, possibly more in the future.

UPDATE roles are not mutually exclusive and they inherit from each other, for example, restricted inherits all the permissions assigned to public.

Solution

If you only have a handful of roles, you don't even save any storage space in PostgreSQL (which you use). An integer column uses 4 bytes, a bigint 8 bytes. Both may require alignment padding (an integer is aligned to 4 bytes, a bigint to 8). A boolean column uses one byte and needs no padding. Effectively, you can fit four or more boolean columns in the space of one integer column, and eight or more in the space of a bigint.
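
To check the per-value sizes on your own installation, something like the following could be used. It is only a sketch: pg_column_size reports the bytes used by a single value, and the real size of a row also depends on alignment padding and the row header, which this query does not show.

```sql
-- Bytes used to store one value of each type.
SELECT pg_column_size(1::integer) AS integer_bytes,
       pg_column_size(1::bigint)  AS bigint_bytes,
       pg_column_size(true)       AS boolean_bytes;
```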

You must also take into account that NULL values only take up one bit (simplified) in the NULL bitmap.

Individual columns are much easier to read and index. Others have commented on that already.

You could still utilize indexes on expressions or partial indexes to circumvent the problem with indexes ("non-sargable") to some extent. Generalized statements like

database cannot use indexes on a query like this

or

These conditions are non-SARGable!

are not entirely true; they may hold for some other RDBMS lacking these features.
But why circumvent the problem when you can avoid it altogether?
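
As a rough sketch of what such indexes could look like for the news table from the question: the index names are made up, and whether the planner actually uses either index depends on the data distribution and on the query repeating the indexed expression.

```sql
-- Partial index: covers only rows readable by 'restricted' (bit value 2).
CREATE INDEX news_restricted_idx ON news (company_fk)
WHERE (read_roles & 2) <> 0;

-- Index on an expression: company plus the result of the flag test.
CREATE INDEX news_company_restricted_idx
ON news (company_fk, ((read_roles & 2) <> 0));

-- The query has to repeat the same expression for either index to be considered.
SELECT * FROM news
WHERE company_fk = 2
AND (read_roles & 2) <> 0;
```

Each flag that needs indexed lookups would need its own index of this kind, which is part of why separate columns are simpler.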


If these flags were mutually exclusive, you could use one column of type enum or a small look-up table and a foreign key referencing it. (Ruled out in question update.)


As you have clarified, we are talking about 6 distinct types (maybe more). Go with individual boolean columns. You'll probably even save space compared to one bigint. Space requirement seems immaterial in this case.
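
A minimal sketch of that layout, assuming the four role names from the question. The column names, the index name, and the sample row are hypothetical, and the foreign key on company_fk is left out to keep the example self-contained:

```sql
CREATE TABLE news (
    pkey              SERIAL PRIMARY KEY,
    title             VARCHAR(256),
    company_fk        INTEGER,  -- foreign key to the companies table omitted here
    body              VARCHAR(512),
    read_public       BOOLEAN NOT NULL DEFAULT FALSE,
    read_restricted   BOOLEAN NOT NULL DEFAULT FALSE,
    read_confidential BOOLEAN NOT NULL DEFAULT FALSE,
    read_secret       BOOLEAN NOT NULL DEFAULT FALSE
);

-- A news item readable by 'restricted' and 'confidential'.
INSERT INTO news (title, company_fk, read_restricted, read_confidential)
VALUES ('Quarterly results', 2, TRUE, TRUE);

-- Posts of company 2 that a 'restricted' user can see.
SELECT * FROM news WHERE company_fk = 2 AND read_restricted;

-- Optional: a partial index for that lookup.
CREATE INDEX news_read_restricted_idx ON news (company_fk) WHERE read_restricted;
```

Adding a seventh role then means adding one more boolean column rather than reserving another bit value.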
