Can PostgreSQL have a uniqueness constraint on array elements?

Question

I'm trying to come up with a PostgreSQL schema for host data that's currently in an LDAP store. Part of that data is the list of hostnames a machine can have, and that attribute is generally the key that most people use to find the host records.

One thing I'd like to get out of moving this data to an RDBMS is the ability to set a uniqueness constraint on the hostname column so that duplicate hostnames can't be assigned. This would be easy if hosts could only have one name, but since they can have more than one it's more complicated.

I realize that the fully-normalized way to do this would be to have a hostnames table with a foreign key pointing back to the hosts table, but I'd like to avoid having everybody need to do joins for even the simplest query:

select hostnames.name,hosts.*
  from hostnames,hosts
 where hostnames.name = 'foobar'
   and hostnames.host_id = hosts.id;

I figured using PostgreSQL arrays could work for this, and they certainly make the simple queries simple:

select * from hosts where names @> '{foobar}';

When I set a uniqueness constraint on the hostnames attribute, though, it of course treats the entire list of names as the unique value instead of each name. Is there a way to make each name unique across every row instead?
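A small sketch of the behavior described above, using a hypothetical table hosts_demo (the table and column names are assumptions for illustration only). A UNIQUE constraint on a text[] column compares whole arrays, so two rows that share a single name are both accepted:

```sql
-- Hypothetical demo table; not part of the original schema.
CREATE TABLE hosts_demo (
  id    serial PRIMARY KEY
, names text[] UNIQUE      -- uniqueness applies to the array as a whole
);

INSERT INTO hosts_demo (names) VALUES ('{foobar,www}');  -- accepted
INSERT INTO hosts_demo (names) VALUES ('{foobar,db}');   -- accepted, although 'foobar' repeats
INSERT INTO hosts_demo (names) VALUES ('{foobar,www}');  -- rejected: the identical array already exists
```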

If not, does anyone know of another data-modeling approach that would make more sense?

Recommended answer

The righteous path

You might want to reconsider normalizing your schema. It is not necessary for everyone to "join for even the simplest query". Create a VIEW for that.

The table could look like this:

CREATE TABLE hostname (
 hostname_id serial PRIMARY KEY
,host_id     int    REFERENCES host(host_id) ON UPDATE CASCADE ON DELETE CASCADE
,hostname    text   UNIQUE
);
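As a rough illustration of how the UNIQUE constraint behaves, the sketch below assumes a minimal parent table host, which has to exist before hostname is created. Only host_id is implied by the foreign key above; the descr column and the sample values are hypothetical.

```sql
-- Assumed parent table; create it before the hostname table.
CREATE TABLE host (
  host_id serial PRIMARY KEY
, descr   text                -- hypothetical payload column
);

-- On a freshly created table the serial column assigns host_id 1 and 2.
INSERT INTO host (descr) VALUES ('web server'), ('db server');

-- One host may have several names ...
INSERT INTO hostname (host_id, hostname) VALUES (1, 'foobar'), (1, 'www');

-- ... but no name can be assigned twice, not even to another host:
INSERT INTO hostname (host_id, hostname) VALUES (2, 'foobar');  -- rejected with a unique violation
```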

The surrogate primary key hostname_id is optional. I prefer to have one. In your case hostname could be the primary key. But many operations are faster with a simple, small integer key. Create a foreign key constraint to link to the table host.
Create a view like this:

CREATE VIEW v_host AS
SELECT h.*
      ,array_agg(hn.hostname) AS hostnames
--    ,string_agg(hn.hostname, ', ') AS hostnames  -- text instead of array
FROM   host h
JOIN   hostname hn USING (host_id)
GROUP  BY h.host_id;   -- works in v9.1+

Starting with Postgres 9.1, the primary key in the GROUP BY clause covers all columns of that table in the SELECT list. The release notes for version 9.1:

Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause
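For illustration, a query against the view could look like the sketch below, assuming v_host has been created as shown. Because the filter applies to the aggregated array, Postgres generally has to aggregate the names of every host before it can discard non-matching rows.

```sql
-- Each row carries the host columns plus all names of that host as an array.
SELECT *
FROM   v_host
WHERE  hostnames @> '{foobar}';   -- same array operator as in the question
```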

Queries can use the view like a table. Searching for a hostname will be much faster this way:

SELECT *
FROM   host h
JOIN   hostname hn USING (host_id)
WHERE  hn.hostname = 'foobar';

This is fast provided you have an index on host(host_id), which should be the case since it should be the primary key. In addition, the UNIQUE constraint on hostname(hostname) automatically creates the other index that is needed.

In Postgres 9.2+ a multicolumn index would be even better if you can get an index-only scan out of it:

CREATE INDEX hn_multi_idx ON hostname (hostname, host_id);
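
To check whether the planner actually chooses an index-only scan, something like the following could be run. This is only a sketch: the plan depends on table size, statistics, and how recently the table was vacuumed, so no output is shown here.

```sql
-- Index-only scans rely on the visibility map, which VACUUM maintains.
VACUUM ANALYZE hostname;

-- Only columns contained in hn_multi_idx are referenced, so the heap
-- need not be visited if the planner picks an index-only scan.
EXPLAIN
SELECT host_id
FROM   hostname
WHERE  hostname = 'foobar';
```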

Starting with Postgres 9.3, you could use a MATERIALIZED VIEW, circumstances permitting. Especially if you read much more often than you write to the table.
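
A minimal sketch of that idea, assuming the host and hostname tables from above. The name mv_host is hypothetical. Uniqueness is still enforced by the hostname table; the materialized view is only a snapshot and stays stale until it is refreshed.

```sql
-- Snapshot of each host's names as an array (mv_host is an assumed name).
CREATE MATERIALIZED VIEW mv_host AS
SELECT h.host_id
     , array_agg(hn.hostname) AS hostnames
FROM   host h
JOIN   hostname hn USING (host_id)
GROUP  BY h.host_id;

-- Re-run after changes to host or hostname to bring the snapshot up to date.
REFRESH MATERIALIZED VIEW mv_host;
```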

If I can't convince you of the righteous path, I'll assist on the dark side, too. I am flexible. :)

Here is a demo of how to enforce uniqueness of hostnames. I use a table hostname to collect hostnames and a trigger on the table host to keep it up to date. Unique violations raise an error and abort the operation.

CREATE TABLE host(hostnames text[]);
CREATE TABLE hostname(hostname text PRIMARY KEY);  --  pk enforces uniqueness

Trigger function:

CREATE OR REPLACE FUNCTION trg_host_insupdelbef()
  RETURNS trigger AS
$func$
BEGIN
-- split UPDATE into DELETE & INSERT
IF TG_OP = 'UPDATE' THEN
   IF OLD.hostnames IS DISTINCT FROM NEW.hostnames THEN  -- keep going
   ELSE RETURN NEW;  -- exit, nothing to do
   END IF;
END IF;

IF TG_OP IN ('DELETE', 'UPDATE') THEN
   DELETE FROM hostname h
   USING  unnest(OLD.hostnames) d(x)
   WHERE  h.hostname = d.x;

   IF TG_OP = 'DELETE' THEN RETURN OLD;  -- exit, we are done
   END IF;
END IF;

-- control only reaches here for INSERT or UPDATE (with actual changes)
INSERT INTO hostname(hostname)
SELECT h
FROM   unnest(NEW.hostnames) h;

RETURN NEW;
END
$func$ LANGUAGE plpgsql;

Trigger:

CREATE TRIGGER host_insupdelbef
BEFORE INSERT OR DELETE OR UPDATE OF hostnames ON host
FOR EACH ROW EXECUTE PROCEDURE trg_host_insupdelbef();

SQL Fiddle with test run.
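
A test run might look like the following sketch, assuming the two demo tables, the trigger function, and the trigger have been created as shown. The sample names are made up for illustration.

```sql
INSERT INTO host VALUES ('{foo,bar}');   -- ok: 'foo' and 'bar' are registered in hostname

INSERT INTO host VALUES ('{baz,foo}');   -- fails: 'foo' is already taken, the insert is aborted

UPDATE host
SET    hostnames = '{foo,baz}'
WHERE  hostnames = '{foo,bar}';          -- ok: old names are released, new names registered

DELETE FROM host
WHERE  hostnames = '{foo,baz}';          -- ok: 'foo' and 'baz' are removed from hostname

SELECT * FROM hostname;                  -- no rows remain after the delete
```

Note that a single array containing the same name twice, such as '{foo,foo}', is also rejected, because both elements are inserted into hostname.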

Use a GIN index on the array column host.hostnames and array operators to work with it:

  • Why isn't my PostgreSQL array index getting used (Rails 4)?
  • Check if any of a given array of values are present in a Postgres array
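
As a sketch of what that could look like for the demo table host (the index name is an assumption):

```sql
-- GIN index on the array column.
CREATE INDEX host_hostnames_gin_idx ON host USING gin (hostnames);

-- "contains": hosts that have the name 'foobar'
SELECT * FROM host WHERE hostnames @> '{foobar}'::text[];

-- "overlaps": hosts that have at least one of the given names
SELECT * FROM host WHERE hostnames && '{foobar,www}'::text[];
```

The array operators @> and && can use the GIN index, whereas the form 'foobar' = ANY (hostnames) cannot.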
