Approaches to generate auto-incrementing numeric ids in CouchDB

Question

Since CouchDB has no equivalent of SQL's AUTO_INCREMENT, what would be your approach to generating sequential, unique numeric ids for your documents?

I use numeric ids for:

  • User-friendly ids (e.g. TASK-123, RQ-001, etc.)
  • Integration with libraries/systems that require a numeric primary key

I am aware of the problems with replication, etc. That's why I am interested in how people try to overcome this issue.

Recommended answer

As Dominic Barnes says, auto-increment integers are not scalable, not distributed-friendly or cloud-friendly. It seems every app nowadays needs a mobile version with offline support, and that is not directly compatible with auto-increment integers. We all know this, but it's true: auto-increment integers are necessary for legacy code and arguably other stuff.

In both scenarios, you are responsible for producing the auto-incrementing integer. A view runs emit(the_numeric_id, null). (You could also add a "type" namespace, e.g. with emit([doc.type, the_numeric_id], null).) Query for the final row (e.g. with startkey=MAXINT&descending=true&limit=1), increment the value returned, and that is your next id. The attempt to save runs in a loop that retries if there was a collision.
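The query-and-increment step can be sketched roughly as follows. This is an illustration under assumptions, not CouchDB's own API: emit is replaced by a local stand-in so the snippet runs without a server, the field name int_id is borrowed from the example later in this answer, and nextId is a hypothetical helper that expects the rows array of a view response fetched with descending=true&limit=1.

```javascript
// Stand-in for the emit() that CouchDB provides to map functions,
// so the sketch can run outside the database.
const indexRows = [];
function emit(key, value) {
  indexRows.push({ key: key, value: value });
}

// Map function as it might appear in a design document.
// Assumption: the numeric id lives in a field called int_id.
function mapNumericId(doc) {
  if (typeof doc.int_id === "number") {
    emit(doc.int_id, null);
  }
}

// Hypothetical helper: given the rows of a view response queried with
// descending=true&limit=1, return the next id (1 for an empty view).
function nextId(rows) {
  return rows.length === 0 ? 1 : rows[0].key + 1;
}

// Simulate indexing three documents, then take the highest-keyed row,
// which is what descending=true&limit=1 would return.
[{ int_id: 7 }, { int_id: 9 }, { title: "no numeric id" }].forEach(mapNumericId);
const lastRow = indexRows
  .slice()
  .sort((a, b) => b.key - a.key)
  .slice(0, 1);
const candidateId = nextId(lastRow);
```

The candidate is only a guess: another client may claim the same number first, which is why the save has to sit inside a retry loop.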

You can also play tricks if you don't need 100% density of the list of IDs. For example, you can add timestamps to the emit() rows, and estimate the document creation velocity, and increment by that velocity times your computation and transmit time. You could also simply increment by a random integer between 1 and N, so most of the time the first insert works, at a cost of non-homogeneous ID numbers.
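Both tricks come down to choosing a larger step than 1. A minimal sketch, assuming you already know the last id and, for the velocity variant, have estimated the creation rate from the timestamps; the function names and parameters here are hypothetical:

```javascript
// Hypothetical helper: skip ahead by a random step between 1 and n.
// `random` is injectable so the behaviour can be checked deterministically.
function nextIdWithJitter(lastId, n, random = Math.random) {
  const step = 1 + Math.floor(random() * n); // integer in 1..n
  return lastId + step;
}

// Hypothetical helper: skip ahead by the number of documents expected to be
// created while this request is being computed and transmitted.
function nextIdByVelocity(lastId, docsPerSecond, latencySeconds) {
  const expected = Math.ceil(docsPerSecond * latencySeconds);
  return lastId + Math.max(1, expected); // always advance by at least 1
}
```

Either way the ids stay unique only because the save is still retried on collision; the larger step just makes a first-try success more likely, at the cost of gaps.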

About where to store the integer, I think there is the id strategy and the try and check strategy.

The id strategy is simpler and quicker in the short term. Document IDs are integers (perhaps prefixed with a type to add a namespace). Since Couch guarantees uniqueness on the _id field, you just worry about the auto-incrementing. Do this in a loop: 409 Conflict triggers a retry, 201 Created means you're done.
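The retry loop might look something like this. To keep the sketch runnable without a server, the database is an in-memory stand-in whose put returns only a status code; a real client would issue an HTTP PUT asynchronously. The names makeFakeDb and saveWithIntegerId, and the TASK- prefix, are assumptions for illustration.

```javascript
// In-memory stand-in for a CouchDB database: writing a new _id answers 201,
// writing to an existing _id without a revision answers 409.
function makeFakeDb() {
  const docs = new Map();
  return {
    put(id, body) {
      if (docs.has(id)) return 409;
      docs.set(id, body);
      return 201;
    },
    has: (id) => docs.has(id),
  };
}

// Hypothetical helper: try ids startId, startId + 1, ... until one is free.
function saveWithIntegerId(db, prefix, startId, body, maxAttempts = 10) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const id = prefix + String(startId + attempt);
    const status = db.put(id, body);
    if (status === 201) return id; // document created: done
    if (status !== 409) throw new Error("unexpected status " + status);
    // 409 Conflict: the id is taken, so retry with the next integer
  }
  throw new Error("no free id after " + maxAttempts + " attempts");
}

const db = makeFakeDb();
db.put("TASK-123", { title: "existing" });
const newId = saveWithIntegerId(db, "TASK-", 123, { title: "new task" });
```

Capping the number of attempts is a design choice made here so that a misbehaving server cannot keep the loop spinning forever.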

I think the major pain with this trick is that if and when you get conflicts, you have two completely unrelated documents, and one of them must be copied into a fresh document. If there were relationships with other documents, they must all be corrected. (The CouchDB 0.11 emit(key, {_id: some_foreign_doc_id}) trick comes to mind.)

The try and check strategy uses the default UUID as the doc._id, so every insert will succeed. Ideally, all or most of your inter-document relations are based on the immutable UUID _id, not the integer. The integer is used only for users and the UI. The auto-incrementing integer is simply a field in the document, {"int_id":20}. The view of course does emit(doc.int_id, null). (You can look up a document by integer id with the ?key=23&include_docs=true parameters of the view.)
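As a rough sketch of the lookup, the helper below builds the view URL for a given integer id. The database, design document, and view names are placeholders, lookupUrl is a hypothetical helper, and the _id value is a made-up UUID.

```javascript
// Hypothetical helper: build the view URL for looking up a document by its
// integer id. The db, ddoc and view arguments are placeholder names.
function lookupUrl(db, ddoc, view, intId, hasReduce = false) {
  const params = new URLSearchParams();
  params.set("key", JSON.stringify(intId)); // view keys are JSON-encoded
  params.set("include_docs", "true");
  if (hasReduce) {
    // include_docs is rejected on a reduced query, so skip the reduce step.
    params.set("reduce", "false");
  }
  return "/" + db + "/_design/" + ddoc + "/_view/" + view + "?" + params.toString();
}

// Document shape for this strategy: UUID _id (made-up value), integer in a field.
const exampleDoc = { _id: "6e1295ed6c29495e54cc05947f18c8af", type: "task", int_id: 23 };
const url = lookupUrl("tasks", "ids", "by_int_id", exampleDoc.int_id);
```

The hasReduce flag matters only if the same view also carries a reduce function, in which case the lookup has to ask for the unreduced rows.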

Of course, after a replication, you might have id conflicts (not official CouchDB conflicts, but just documents using the same numeric id). The view which emits by ID would also have a reduce phase: simply _count should be enough. Next you must patrol the DB, querying this view with ?group=true and looking for any row (corresponding to an integer id) which has a count > 1. On the plus side, correcting the numeric id of a document is a minor change because it does not require new document creation.
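The patrol step could be sketched like this. The design document names (ids, by_int_id) are placeholders and duplicatedIds is a hypothetical helper; the sample response only imitates the shape of a ?group=true result, where each row's value is the count for that key.

```javascript
// Design document sketch: map by integer id, built-in _count as the reduce.
// The names "ids" and "by_int_id" are placeholders.
const designDoc = {
  _id: "_design/ids",
  views: {
    by_int_id: {
      map: "function (doc) { if (typeof doc.int_id === 'number') emit(doc.int_id, null); }",
      reduce: "_count",
    },
  },
};

// Hypothetical helper: given the JSON body of a ?group=true query,
// return the integer ids used by more than one document.
function duplicatedIds(viewResponse) {
  return viewResponse.rows
    .filter((row) => row.value > 1)
    .map((row) => row.key);
}

// Example response shape for ?group=true (values are per-key counts).
const sample = {
  rows: [
    { key: 20, value: 1 },
    { key: 21, value: 2 },
    { key: 23, value: 3 },
  ],
};
const dupes = duplicatedIds(sample);
```

Each id reported this way still needs a follow-up query for the documents behind it, so that all but one can be given a fresh number.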

Those are my ideas. Now that I have written them down, I feel like you must do relation-shepherding regardless of where the id is stored, so perhaps using _id is better after all. The only other downside I see is that you are permanently married to a fundamentally broken naming model, for some definition of "permanently."
