PostgreSQL: generate a sequence with no gaps

Problem description

I have to create unique IDs for invoices. I have a table with an id column and another column for this unique number. I use the serializable isolation level. Using

  var seq = @"SELECT invoice_serial + 1 FROM invoice WHERE ""type""=@type ORDER BY invoice_serial DESC LIMIT 1";

doesn't help, because even with FOR UPDATE it won't read the correct value under the serializable isolation level.

The only solution seems to be adding some retry code.

Solution

Sequences do not generate gap-free sets of numbers, and there's really no way of making them do that because a rollback or error will "use" the sequence number.

I wrote up an article on this a while ago. It's directed at Oracle but is really about the fundamental principles of gap-free numbers, and I think the same applies here.

Well, it’s happened again. Someone has asked how to implement a requirement to generate a gap-free series of numbers and a swarm of nay-sayers have descended on them to say (and here I paraphrase slightly) that this will kill system performance, that it’s rarely a valid requirement, that whoever wrote the requirement is an idiot blah blah blah.

As I point out on the thread, it is sometimes a genuine legal requirement to generate gap-free series of numbers. Invoice numbers for the 2,000,000+ organisations in the UK that are VAT (sales tax) registered have such a requirement, and the reason for this is rather obvious: it makes it more difficult to hide the generation of revenue from tax authorities. I’ve seen comments that it is a requirement in Spain and Portugal, and I’d not be surprised if it were a requirement in many other countries.

So, if we accept that it is a valid requirement, under what circumstances are gap-free series of numbers a problem? Group-think would often have you believe that it always is, but in fact it is only a potential problem under very particular circumstances, when all of the following hold:

  1. The series of numbers must have no gaps.
  2. Multiple processes create the entities with which the number is associated (e.g. invoices).
  3. The numbers must be generated at the time that the entity is created.

If all of these requirements must be met then you have a point of serialisation in your application, and we’ll discuss that in a moment.

First let’s talk about methods of implementing a series-of-numbers requirement if you can drop any one of those requirements.

If your series of numbers can have gaps (and you have multiple processes requiring instant generation of the number) then use an Oracle Sequence object. They are very high performance and the situations in which gaps can be expected have been very well discussed. If it is important, it is not too challenging to minimise the number of values skipped by designing the process to minimise the chance of a failure between generating the number and committing the transaction.

If you do not have multiple processes creating the entities (and you need a gap-free series of numbers that must be instantly generated), as might be the case with the batch generation of invoices, then you already have a point of serialisation. That in itself may not be a problem, and may be an efficient way of performing the required operation. Generating the gap-free numbers is rather trivial in this case. You can read the current maximum value and apply an incrementing value to every entity with a number of techniques. For example, if you are inserting a new batch of invoices into your invoice table from a temporary working table you might:

insert into
  invoices
    (
    invoice#,
    ...)
with curr as (
  select Coalesce(Max(invoice#), 0) max_invoice#
  from   invoices)
select
  curr.max_invoice#+rownum,
  ...
from
  tmp_invoice
  ...

Of course you would protect your process so that only one instance can run at a time (probably with DBMS_Lock if you're using Oracle), and protect the invoice# with a unique key constraint, and probably check for missing values with separate code if you really, really care.
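The batch approach above can be sketched as follows. This is only an illustration, not the article's code: it uses SQLite from the Python standard library as a stand-in so that it runs on its own, and the names invoice_no, customer and number_batch are hypothetical. BEGIN IMMEDIATE takes SQLite's write lock, which here plays the role of the "only one instance at a time" protection; on PostgreSQL or Oracle you would use that database's own locking instead.

```python
import sqlite3


def number_batch(conn: sqlite3.Connection) -> int:
    """Move staged rows into invoices, numbering them from max(invoice_no) + 1."""
    conn.execute("BEGIN IMMEDIATE")  # write lock: only one batch runs at a time
    try:
        (current_max,) = conn.execute(
            "SELECT COALESCE(MAX(invoice_no), 0) FROM invoices"
        ).fetchone()
        staged = conn.execute(
            "SELECT customer FROM tmp_invoice ORDER BY id"
        ).fetchall()
        rows = [
            (current_max + offset, customer)
            for offset, (customer,) in enumerate(staged, start=1)
        ]
        conn.executemany(
            "INSERT INTO invoices (invoice_no, customer) VALUES (?, ?)", rows
        )
        conn.execute("DELETE FROM tmp_invoice")
        conn.execute("COMMIT")
        return len(rows)
    except Exception:
        conn.execute("ROLLBACK")  # nothing is numbered if the batch fails
        raise


def demo_batch() -> list:
    # isolation_level=None: transactions are controlled explicitly above
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE invoices (invoice_no INTEGER NOT NULL UNIQUE, customer TEXT)"
    )
    conn.execute("CREATE TABLE tmp_invoice (id INTEGER PRIMARY KEY, customer TEXT)")
    for batch in (["a", "b"], ["c", "d", "e"]):
        conn.executemany(
            "INSERT INTO tmp_invoice (customer) VALUES (?)", [(c,) for c in batch]
        )
        number_batch(conn)
    return [
        row[0]
        for row in conn.execute("SELECT invoice_no FROM invoices ORDER BY invoice_no")
    ]
```

Calling demo_batch() runs two batches back to back; the UNIQUE constraint on invoice_no is the safety net if two batches ever did overlap.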

If you do not need instant generation of the numbers (but you need them gap-free and multiple processes generate the entities) then you can allow the entities to be generated and the transaction committed, and then leave generation of the number to a single batch job. That job can either update the entity table or insert into a separate table.
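A minimal sketch of this deferred numbering, again using SQLite from the Python standard library purely as a runnable stand-in. The schema and the names assign_numbers and invoice_no are hypothetical. Writers insert rows with no number; a single job later numbers whatever is still unnumbered, in id order.

```python
import sqlite3


def assign_numbers(conn: sqlite3.Connection) -> int:
    """Batch job: give every unnumbered invoice the next number, in id order."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        (next_no,) = conn.execute(
            "SELECT COALESCE(MAX(invoice_no), 0) + 1 FROM invoices"
        ).fetchone()
        pending = conn.execute(
            "SELECT id FROM invoices WHERE invoice_no IS NULL ORDER BY id"
        ).fetchall()
        for offset, (row_id,) in enumerate(pending):
            conn.execute(
                "UPDATE invoices SET invoice_no = ? WHERE id = ?",
                (next_no + offset, row_id),
            )
        conn.execute("COMMIT")
        return len(pending)
    except Exception:
        conn.execute("ROLLBACK")
        raise


def demo_deferred() -> list:
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE invoices ("
        "id INTEGER PRIMARY KEY, customer TEXT, invoice_no INTEGER UNIQUE)"
    )
    # Writers commit entities without a number.
    for customer in ("a", "b", "c"):
        conn.execute("INSERT INTO invoices (customer) VALUES (?)", (customer,))
    # A writer that fails leaves no row behind, so it cannot cause a gap.
    conn.execute("BEGIN")
    conn.execute("INSERT INTO invoices (customer) VALUES ('failed')")
    conn.execute("ROLLBACK")
    assign_numbers(conn)
    conn.execute("INSERT INTO invoices (customer) VALUES ('d')")
    assign_numbers(conn)
    return conn.execute(
        "SELECT customer, invoice_no FROM invoices ORDER BY id"
    ).fetchall()
```

Because only committed rows are ever numbered, a rolled-back writer cannot consume a number.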

So what if we need the trifecta of instant generation of a gap-free series of numbers by multiple processes? All we can do is to try to minimise the period of serialisation in the process, and I offer the following advice, and welcome any additional advice (or counter-advice of course).

  1. Store your current values in a dedicated table. DO NOT use a sequence.
  2. Ensure that all processes use the same code to generate new numbers by encapsulating it in a function or procedure.
  3. Serialise access to the number generator with DBMS_Lock, making sure that each series has its own dedicated lock.
  4. Hold the lock in the series generator until your entity creation transaction is complete by releasing the lock on commit.
  5. Delay the generation of the number until the last possible moment.
  6. Consider the impact of an unexpected error after generating the number and before the commit is completed: will the application roll back gracefully and release the lock, or will it hold the lock on the series generator until the session disconnects later? Whatever method is used, if the transaction fails then the series number(s) must be "returned to the pool".
  7. Can you encapsulate the whole thing in a trigger on the entity’s table? Can you encapsulate it in a table or other API call that inserts the row and commits the insert automatically?
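The advice in the list above can be sketched as a dedicated counter table that is updated inside the same transaction that creates the entity. This is a sketch under assumptions, not the article's implementation: it runs on SQLite from the Python standard library, the table invoice_counter and the functions next_number and create_invoice are hypothetical names, and BEGIN IMMEDIATE stands in for the lock. In PostgreSQL, the row lock taken by an UPDATE on the counter row is held until commit or rollback, so it can play the same role for each series.

```python
import sqlite3


def next_number(conn: sqlite3.Connection, series: str) -> int:
    """Increment and return the counter for one series (call inside a transaction)."""
    cur = conn.execute(
        "UPDATE invoice_counter SET last_value = last_value + 1 WHERE series = ?",
        (series,),
    )
    if cur.rowcount != 1:
        raise KeyError(series)
    (value,) = conn.execute(
        "SELECT last_value FROM invoice_counter WHERE series = ?", (series,)
    ).fetchone()
    return value


def create_invoice(conn, series: str, customer: str, fail: bool = False) -> int:
    conn.execute("BEGIN IMMEDIATE")  # serialisation point, held until commit
    try:
        # Do any other work first; take the number at the last possible moment.
        number = next_number(conn, series)
        conn.execute(
            "INSERT INTO invoice (series, invoice_serial, customer) VALUES (?, ?, ?)",
            (series, number, customer),
        )
        if fail:
            raise RuntimeError("simulated failure before commit")
        conn.execute("COMMIT")
        return number
    except Exception:
        # The counter update is undone too: the number returns to the pool.
        conn.execute("ROLLBACK")
        raise


def demo_counter() -> list:
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE invoice_counter ("
        "series TEXT PRIMARY KEY, last_value INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE invoice (series TEXT, invoice_serial INTEGER, customer TEXT, "
        "UNIQUE (series, invoice_serial))"
    )
    conn.execute("INSERT INTO invoice_counter VALUES ('A', 0)")
    issued = [create_invoice(conn, "A", "first")]
    try:
        create_invoice(conn, "A", "broken", fail=True)
    except RuntimeError:
        pass
    issued.append(create_invoice(conn, "A", "second"))
    return issued
```

The failed call in demo_counter() rolls back both the invoice row and the counter increment, so the next successful invoice reuses the number instead of leaving a gap.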

Original article
