Can PostgreSQL have a uniqueness constraint on array elements?


Question

I'm trying to come up with a PostgreSQL schema for host data that's currently in an LDAP store. Part of that data is the list of hostnames a machine can have, and that attribute is generally the key that most people use to find the host records.

One thing I'd like to get out of moving this data to an RDBMS is the ability to set a uniqueness constraint on the hostname column so that duplicate hostnames can't be assigned. This would be easy if hosts could only have one name, but since they can have more than one it's more complicated.

I realize that the fully-normalized way to do this would be to have a hostnames table with a foreign key pointing back to the hosts table, but I'd like to avoid having everybody need to do joins for even the simplest query:

select hostnames.name,hosts.*
  from hostnames,hosts
 where hostnames.name = 'foobar'
   and hostnames.host_id = hosts.id;

I figured using PostgreSQL arrays could work for this, and they certainly make the simple queries simple:

select * from hosts where names @> '{foobar}';

When I set a uniqueness constraint on the hostnames attribute, though, it of course treats the entire list of names as the unique value instead of each name. Is there a way to make each name unique across every row instead?
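
To illustrate the behaviour described above, here is a minimal sketch. The table name host_demo and its columns are hypothetical and only serve this demonstration; they are not part of the schema in the question.

```sql
-- Hypothetical demo table (names are assumptions for illustration only)
CREATE TABLE host_demo (
   host_id serial PRIMARY KEY
 , names   text[] UNIQUE   -- compares whole arrays, not individual elements
);

INSERT INTO host_demo (names) VALUES ('{foobar,web1}');  -- accepted
INSERT INTO host_demo (names) VALUES ('{foobar,web2}');  -- also accepted: the arrays differ,
                                                         -- although 'foobar' is now assigned twice
INSERT INTO host_demo (names) VALUES ('{foobar,web1}');  -- rejected: identical array already exists
```

Only the third statement is refused, which shows that the constraint is evaluated per array value rather than per element.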

If not, does anyone know of another data-modeling approach that would make more sense?

Recommended answer

The righteous path

You may want to reconsider your schema. Not everyone has to write a join for even the simplest query. Create a VIEW for that.

The table could look like this:

CREATE TABLE hostname (
 hostname_id serial PRIMARY KEY
,host_id     int    REFERENCES host(host_id) ON UPDATE CASCADE ON DELETE CASCADE
,hostname    text   UNIQUE
);

The surrogate primary key hostname_id is optional. I prefer to have one. In your case, hostname could serve as the primary key, but many operations are faster with a simple, small integer key. The foreign key constraint links each row to the table host.

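
The effect of the UNIQUE constraint can be tried out with a few rows. This is only a sketch: the answer does not show the definition of host, so the minimal host table below (with a made-up note column) is an assumption, and it has to exist before the hostname table above is created because of the foreign key.

```sql
-- Assumed minimal parent table; create it before the hostname table
CREATE TABLE host (
   host_id serial PRIMARY KEY
 , note    text               -- placeholder column, purely illustrative
);

-- On a fresh table the serial column hands out host_id 1 and 2
INSERT INTO host (note) VALUES ('first machine'), ('second machine');

-- Several names for one host are fine
INSERT INTO hostname (host_id, hostname)
VALUES (1, 'foobar'), (1, 'foobar-alias');

-- Re-using a name for another host fails with a unique violation
INSERT INTO hostname (host_id, hostname)
VALUES (2, 'foobar');
```
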
Create a view like this:

CREATE VIEW v_host AS
SELECT h.*
      ,array_agg(hn.hostname) AS hostnames
--    ,string_agg(hn.hostname, ', ') AS hostnames  -- text instead of array
FROM   host h
JOIN   hostname hn USING (host_id)
GROUP  BY h.host_id;   -- works in v9.1+

Starting with pg 9.1, the primary key in the GROUP BY covers all columns of that table in the SELECT list. The release notes for version 9.1:


Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause

Queries can use the view like a table. Searching for a hostname will be much faster this way:

SELECT *
FROM   host h
JOIN   hostname hn USING (host_id)
WHERE  hn.hostname = 'foobar';

This is fast provided you have an index on host(host_id), which should be the case since it should be the primary key. In addition, the UNIQUE constraint on hostname(hostname) creates the other needed index automatically.
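
For comparison, the same lookup through the view could be written as below. This is a sketch that assumes the view v_host defined earlier. It is the more convenient form, but the filter on the aggregated hostnames column can only be applied after array_agg() has built the array, so the direct join is usually the better choice for point lookups.

```sql
-- Convenient: no join to write, all names of the host in one column
SELECT *
FROM   v_host
WHERE  hostnames @> '{foobar}';

-- Alternative without array operators
SELECT *
FROM   v_host
WHERE  'foobar' = ANY (hostnames);
```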

In Postgres 9.2+ a multicolumn index would be even better if you can get an index-only scan out of it:

CREATE INDEX hn_multi_idx ON hostname (hostname, host_id);
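
Whether the planner really chooses an index-only scan can be checked in the query plan. A sketch, assuming the hostname table and the index hn_multi_idx from above; the VACUUM step is included because index-only scans depend on a reasonably current visibility map. On a small table the planner may still prefer a different plan.

```sql
VACUUM ANALYZE hostname;   -- refresh visibility map and planner statistics

EXPLAIN
SELECT host_id
FROM   hostname
WHERE  hostname = 'foobar';
-- look for a node like "Index Only Scan using hn_multi_idx" in the plan
```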

Starting with Postgres 9.3, you could use a MATERIALIZED VIEW, circumstances permitting. Especially if you read much more often than you write to the table.
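
A materialized variant of the view might look like the following sketch. The names mv_host and mv_host_hostnames_gin are made up for illustration, and the refresh schedule is left open because it depends on how stale the data may be.

```sql
-- Same query as the plain view, but the result is stored on disk
CREATE MATERIALIZED VIEW mv_host AS
SELECT h.*
     , array_agg(hn.hostname) AS hostnames
FROM   host h
JOIN   hostname hn USING (host_id)
GROUP  BY h.host_id;

-- Materialized views can be indexed like tables
CREATE INDEX mv_host_hostnames_gin ON mv_host USING gin (hostnames);

-- The stored result does not follow changes automatically;
-- refresh it after writes to host or hostname
REFRESH MATERIALIZED VIEW mv_host;
```

Lookups such as WHERE hostnames @> '{foobar}' can then use the GIN index, at the price of data that is only as fresh as the last refresh.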

If I can't convince you of the righteous path, I'll assist on the dark side, too. I am flexible. :)

Here is a demo of how to enforce uniqueness of hostnames. I use a table hostname to collect hostnames and a trigger on the table host to keep it up to date. Unique violations raise an error and abort the operation.

CREATE TABLE host(hostnames text[]);
CREATE TABLE hostname(hostname text PRIMARY KEY);  --  pk enforces uniqueness

Trigger function:

CREATE OR REPLACE FUNCTION trg_host_insupdelbef()
  RETURNS trigger AS
$func$
BEGIN
-- split UPDATE into DELETE & INSERT
IF TG_OP = 'UPDATE' THEN
   IF OLD.hostnames IS DISTINCT FROM NEW.hostnames THEN  -- keep going
   ELSE RETURN NEW;  -- exit, nothing to do
   END IF;
END IF;

IF TG_OP IN ('DELETE', 'UPDATE') THEN
   DELETE FROM hostname h
   USING  unnest(OLD.hostnames) d(x)
   WHERE  h.hostname = d.x;

   IF TG_OP = 'DELETE' THEN RETURN OLD;  -- exit, we are done
   END IF;
END IF;

-- control only reaches here for INSERT or UPDATE (with actual changes)
INSERT INTO hostname(hostname)
SELECT h
FROM   unnest(NEW.hostnames) h;

RETURN NEW;
END
$func$ LANGUAGE plpgsql;

Trigger:

CREATE TRIGGER host_insupdelbef
BEFORE INSERT OR DELETE OR UPDATE OF hostnames ON host
FOR EACH ROW EXECUTE PROCEDURE trg_host_insupdelbef();
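
A short test run could look like this. It is a sketch with made-up values, assuming the two tables, the trigger function and the trigger above, and that each statement runs on its own (in autocommit mode), so the failing statement does not abort the ones after it.

```sql
INSERT INTO host (hostnames) VALUES ('{foo,bar}');   -- ok: registers foo and bar

INSERT INTO host (hostnames) VALUES ('{baz,foo}');   -- fails: foo is already taken,
                                                     -- the whole row is rejected

UPDATE host
SET    hostnames = '{foo,baz}'
WHERE  hostnames = '{foo,bar}';                      -- ok: old names are released first,
                                                     -- then foo and baz are registered

DELETE FROM host WHERE hostnames @> '{foo}';         -- releases foo and baz again

SELECT * FROM hostname;                              -- no rows left at this point
```

Note that an array repeating a name, such as '{foo,foo}', is rejected as well, because both elements are inserted into hostname.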

SQL Fiddle with test run.

Use a GIN index on the array column host.hostnames and array operators to work with it:

  • Why isn't my PostgreSQL array index getting used (Rails 4)?
  • Check if any of a given array of values are present in a Postgres array
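
As a sketch of the GIN approach, assuming the host table with the text[] column hostnames from the demo above (the index name is made up):

```sql
CREATE INDEX host_hostnames_gin_idx ON host USING gin (hostnames);

-- Array operators that the GIN index can support
SELECT * FROM host WHERE hostnames @> '{foobar}';    -- contains all listed names
SELECT * FROM host WHERE hostnames && '{foo,bar}';   -- shares at least one name

-- Equivalent result, but this form cannot use the GIN index
SELECT * FROM host WHERE 'foobar' = ANY (hostnames);
```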
