PostgreSQL gapless sequences

Question

I'm moving from MySQL to Postgres, and I noticed that when you delete rows in MySQL, the unique IDs of those rows are re-used when you create new ones. In Postgres, if you create rows and then delete them, the unique IDs are not used again.

Is there a reason for this behaviour in Postgres? Can I make it act more like MySQL in this case?

Recommended answer

Sequences have gaps to permit concurrent inserts. Attempting to avoid gaps or to re-use deleted IDs creates horrible performance problems. See the PostgreSQL wiki FAQ.

PostgreSQL SEQUENCEs are used to allocate IDs. These only ever increase, and they're exempt from the usual transaction rollback rules to permit multiple transactions to grab new IDs at the same time. This means that if a transaction rolls back, those IDs are "thrown away"; there's no list of "free" IDs kept, just the current ID counter. Sequences are also usually incremented if the database shuts down uncleanly.
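The rollback behaviour described above can be seen with a small experiment. This is a minimal sketch, assuming a fresh scratch table named gap_demo (a hypothetical name, not from the original answer) and a single session with nothing else drawing from the sequence:

```sql
-- Hypothetical scratch table; "serial" creates and attaches a sequence.
CREATE TABLE gap_demo (id serial PRIMARY KEY, note text);

INSERT INTO gap_demo (note) VALUES ('first');        -- takes id 1

BEGIN;
INSERT INTO gap_demo (note) VALUES ('rolled back');  -- takes id 2
ROLLBACK;                                            -- the row is gone, but id 2 is not returned

INSERT INTO gap_demo (note) VALUES ('second');       -- takes id 3

-- Shows the surviving rows; id 2 is missing.
SELECT id, note FROM gap_demo ORDER BY id;
```

The sequence advance in the rolled-back transaction is never undone, which is exactly what lets other transactions draw IDs without waiting.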

Synthetic keys (IDs) are meaningless anyway. Their order is not significant; their only significant property is uniqueness. You can't meaningfully measure how "far apart" two IDs are, nor can you meaningfully say whether one is greater or less than another. All you can do is say "equal" or "not equal". Anything else is unsafe. You shouldn't care about gaps.

If you need a gapless sequence that re-uses deleted IDs, you can have one, you just have to give up a huge amount of performance for it - in particular, you cannot have any concurrency on INSERTs at all, because you have to scan the table for the lowest free ID, locking the table for write so no other transaction can claim the same ID. Try searching for "postgresql gapless sequence".
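To make the cost concrete, here is a rough sketch of the "scan for the lowest free ID under a table lock" approach. It is not from the original answer. It assumes a table dummy(id integer, blah integer), as in the usage example in this answer, with positive IDs starting at 1:

```sql
BEGIN;

-- EXCLUSIVE mode still allows plain reads but blocks every other writer,
-- so no concurrent transaction can claim the same ID.
LOCK TABLE dummy IN EXCLUSIVE MODE;

-- Candidate IDs are 1 and "each existing id + 1"; the lowest candidate
-- that is not already present is the lowest free ID.
INSERT INTO dummy (id, blah)
SELECT min(c.candidate), 42
FROM (
    SELECT 1 AS candidate
    UNION
    SELECT id + 1 FROM dummy
) AS c
WHERE NOT EXISTS (SELECT 1 FROM dummy AS d WHERE d.id = c.candidate);

COMMIT;
```

Every insert has to take the table lock and scan for a hole, so inserts run strictly one at a time.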

The simplest approach is to use a counter table and a function that gets the next ID. Here's a generalized version that uses a counter table to generate consecutive gapless IDs; it doesn't re-use IDs, though.

CREATE TABLE thetable_id_counter ( last_id integer not null );
INSERT INTO thetable_id_counter VALUES (0);

CREATE OR REPLACE FUNCTION get_next_id(countertable regclass, countercolumn text) RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;
    RETURN next_value;
END;
$$ LANGUAGE plpgsql;

COMMENT ON FUNCTION get_next_id(regclass, text) IS 'Increment and return value from integer column $2 in table $1';

Usage:

INSERT INTO dummy(id, blah) 
VALUES ( get_next_id('thetable_id_counter','last_id'), 42 );

Note that when one open transaction has obtained an ID, all other transactions that try to call get_next_id will block until the first transaction commits or rolls back. This is unavoidable for gapless IDs and is by design.
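The blocking can be illustrated with two psql sessions. This is a transcript-style sketch rather than a single runnable script; it assumes the counter table and function defined earlier in this answer, the dummy table from the usage example, and the default READ COMMITTED isolation level:

```sql
-- Session A
BEGIN;
INSERT INTO dummy (id, blah)
VALUES (get_next_id('thetable_id_counter', 'last_id'), 1);
-- Transaction left open: A now holds the row lock on the counter row.

-- Session B (a second connection)
BEGIN;
INSERT INTO dummy (id, blah)
VALUES (get_next_id('thetable_id_counter', 'last_id'), 2);
-- Blocks here, waiting for A's row lock to be released.

-- Session A
ROLLBACK;
-- A's counter increment is undone together with its INSERT.

-- Session B
-- The blocked INSERT now proceeds and receives the value A had taken,
-- so no gap is left behind.
COMMIT;
```

Because the counter update is ordinary transactional data, a rollback returns the ID, and that is also why everyone else has to wait.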

If you want to store multiple counters for different purposes in a table, just add a parameter to the above function, add a column to the counter table, and add a WHERE clause to the UPDATE that matches the parameter to the added column. That way you can have multiple independently-locked counter rows. Do not just add extra columns for new counters.
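A multi-counter variant might look like the sketch below. The table name id_counters, the column counter_name, and the function name get_next_named_id are illustrative assumptions, and for brevity the table is hard-coded rather than passed in as a regclass parameter:

```sql
-- One row per counter; each row is locked independently of the others.
CREATE TABLE id_counters (
    counter_name text PRIMARY KEY,
    last_id      integer NOT NULL
);
INSERT INTO id_counters VALUES ('invoices', 0), ('orders', 0);

CREATE OR REPLACE FUNCTION get_next_named_id(p_counter_name text) RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    -- The WHERE clause limits the row lock to the requested counter.
    UPDATE id_counters
       SET last_id = last_id + 1
     WHERE counter_name = p_counter_name
    RETURNING last_id INTO next_value;

    -- Fail loudly instead of returning NULL for an unknown counter.
    IF NOT FOUND THEN
        RAISE EXCEPTION 'no counter named %', p_counter_name;
    END IF;

    RETURN next_value;
END;
$$ LANGUAGE plpgsql;
```

With this layout, a transaction holding the 'invoices' row does not block one asking for the next 'orders' ID. Separate columns in a single row would share one row lock.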

This function does not re-use deleted IDs, it just avoids introducing gaps.

To re-use IDs I advise ... not re-using IDs.

If you really must, you can do so by adding an ON INSERT OR UPDATE OR DELETE trigger on the table of interest that adds deleted IDs to a free-list side table, and removes them from the free-list table when they're INSERTed. Treat an UPDATE as a DELETE followed by an INSERT. Now modify the ID generation function above so that it does a SELECT free_id INTO next_value FROM free_ids FOR UPDATE LIMIT 1 and if found, DELETEs that row. IF NOT FOUND gets a new ID from the generator table as normal. Here's an untested extension of the prior function to support re-use:

CREATE OR REPLACE FUNCTION get_next_id_reuse(countertable regclass, countercolumn text, freelisttable regclass, freelistcolumn text) RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    EXECUTE format('SELECT %I FROM %s FOR UPDATE LIMIT 1', freelistcolumn, freelisttable) INTO next_value;
    IF next_value IS NOT NULL THEN
        EXECUTE format('DELETE FROM %s WHERE %I = %L', freelisttable, freelistcolumn, next_value);
    ELSE
        EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;
    END IF;
    RETURN next_value;
END;
$$ LANGUAGE plpgsql;
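The trigger side is described above but not shown. Here is an equally untested sketch of what it might look like, assuming the table of interest is dummy with an integer id column and a free-list table named dummy_free_ids (both names are illustrative). It uses ON CONFLICT (PostgreSQL 9.5+) and EXECUTE FUNCTION (PostgreSQL 11+; older releases spell it EXECUTE PROCEDURE):

```sql
-- Hypothetical free-list side table.
CREATE TABLE dummy_free_ids (free_id integer PRIMARY KEY);

CREATE OR REPLACE FUNCTION dummy_track_free_ids() RETURNS trigger AS $$
BEGIN
    -- A DELETE, or the "delete" half of an UPDATE, frees the old ID.
    IF TG_OP IN ('DELETE', 'UPDATE') THEN
        INSERT INTO dummy_free_ids (free_id)
        VALUES (OLD.id)
        ON CONFLICT DO NOTHING;
    END IF;

    -- An INSERT, or the "insert" half of an UPDATE, claims the new ID.
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        DELETE FROM dummy_free_ids WHERE free_id = NEW.id;
    END IF;

    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER dummy_free_ids_trg
AFTER INSERT OR UPDATE OR DELETE ON dummy
FOR EACH ROW EXECUTE FUNCTION dummy_track_free_ids();
```

An UPDATE that leaves id unchanged adds the ID to the free list and removes it again in the same call, so it has no net effect. IDs would then be drawn with get_next_id_reuse('thetable_id_counter', 'last_id', 'dummy_free_ids', 'free_id').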
