Distributed sequence number generation?

Question

I've generally implemented sequence number generation using database sequences in the past.

e.g. Using Postgres SERIAL type http://www.neilconway.org/docs/sequences/

I'm curious, though, how to generate sequence numbers for large distributed systems where there is no database. Does anybody have experience with, or suggestions for, a best practice for generating sequence numbers in a thread-safe manner for multiple clients?

Recommended answer

OK, this is a very old question, which I'm seeing for the first time now.

You'll need to differentiate between sequence numbers and unique IDs that are (optionally) loosely sortable by a specific criterion (typically generation time). True sequence numbers imply knowledge of what all other workers have done, and as such require shared state. There is no easy way of doing this in a distributed, high-scale manner. You could look into things like network broadcasts, windowed ranges for each worker, and distributed hash tables for unique worker IDs, but it's a lot of work.
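
One possible reading of "windowed ranges for each worker" is a shared allocator that hands each worker a disjoint block of numbers, which the worker then consumes locally. The sketch below is only an in-process illustration of that idea: `RangeAllocator`, `Worker`, and `block_size` are hypothetical names, and in a real deployment the allocator would be a networked service holding the shared state. Note that the resulting numbers are unique but not globally ordered across workers.

```python
import threading


class RangeAllocator:
    """Hypothetical stand-in for the shared state: hands out disjoint blocks."""

    def __init__(self, block_size=1000):
        self._block_size = block_size
        self._next_start = 0
        self._lock = threading.Lock()

    def allocate(self):
        # Only this step needs coordination; it happens once per block,
        # not once per number.
        with self._lock:
            start = self._next_start
            self._next_start += self._block_size
            return range(start, start + self._block_size)


class Worker:
    """Consumes numbers from its current block, fetching a new one when empty."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._block = iter(())  # empty until the first request

    def next_number(self):
        try:
            return next(self._block)
        except StopIteration:
            self._block = iter(self._allocator.allocate())
            return next(self._block)
```

With a block size of 3, a first worker would draw 0, 1, 2, a second worker would start at 3, and the first worker's next request would jump to 6. That gap in ordering is the price of avoiding a round trip per number.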

Unique IDs are another matter, and there are several good ways of generating unique IDs in a decentralized manner:

a) You could use Twitter's Snowflake ID network service. Snowflake is a:


  • Networked service, i.e. you make a network call to get a unique ID;
  • which produces 64-bit unique IDs that are ordered by generation time;
  • and the service is highly scalable and (potentially) highly available; each instance can generate many thousands of IDs per second, and you can run multiple instances on your LAN/WAN;
  • written in Scala, runs on the JVM.

b) You could generate the unique IDs on the clients themselves, using an approach derived from how UUIDs and Snowflake's IDs are made. There are multiple options, but something along the lines of:


  • The most significant 40 or so bits: A timestamp; the generation time of the ID. (We're using the most significant bits for the timestamp to make IDs sortable by generation time.)
  • The next 14 or so bits: A per-generator counter, which each generator increments by one for each new ID generated. This ensures that IDs generated at the same moment (same timestamp) do not overlap.
  • The last 10 or so bits: A unique value for each generator. Using this, we don't need to do any synchronization between generators (which is extremely hard), as all generators produce non-overlapping IDs because of this value.
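
The 40/14/10 layout above could be sketched roughly as follows. This is a minimal illustration, not Snowflake's actual implementation: the class name `IdGenerator`, the millisecond resolution, the custom epoch `EPOCH_MS` (needed because milliseconds since 1970 no longer fit in 40 bits), and the handling of clock regressions are all assumptions made for the example.

```python
import threading
import time

TIMESTAMP_BITS = 40
COUNTER_BITS = 14
GENERATOR_BITS = 10

# Assumed custom epoch (2020-01-01 UTC, in ms) so the timestamp fits in 40 bits.
EPOCH_MS = 1_577_836_800_000


class IdGenerator:
    def __init__(self, generator_id, clock=lambda: time.time_ns() // 1_000_000):
        if not 0 <= generator_id < (1 << GENERATOR_BITS):
            raise ValueError("generator_id does not fit in GENERATOR_BITS")
        self._generator_id = generator_id
        self._clock = clock  # returns wall-clock milliseconds; injectable for tests
        self._lock = threading.Lock()
        self._last_ms = -1
        self._counter = 0

    def next_id(self):
        with self._lock:
            now = self._clock() - EPOCH_MS
            if now < self._last_ms:
                # Clock went backwards: keep using the last timestamp seen.
                now = self._last_ms
            if now == self._last_ms:
                self._counter += 1
                if self._counter >= (1 << COUNTER_BITS):
                    # Counter exhausted for this millisecond: spin until the next.
                    while now <= self._last_ms:
                        now = self._clock() - EPOCH_MS
                    self._counter = 0
            else:
                self._counter = 0
            self._last_ms = now
            if now >= (1 << TIMESTAMP_BITS):
                raise OverflowError("timestamp no longer fits in TIMESTAMP_BITS")
            return (
                (now << (COUNTER_BITS + GENERATOR_BITS))
                | (self._counter << GENERATOR_BITS)
                | self._generator_id
            )
```

Each process would construct one `IdGenerator` with its own `generator_id`; how those 10-bit values are assigned without clashes (configuration, a coordination service, and so on) is outside this sketch.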

c) You could generate the IDs on the clients, using just a timestamp and a random value. This avoids the need to know all generators and to assign each generator a unique value. On the flip side, such IDs are not guaranteed to be globally unique; they're only very highly likely to be unique. (To collide, one or more generators would have to create the same random value at the exact same time.) Something along the lines of:


  • The most significant 32 bits: Timestamp, the generation time of the ID.
  • The least significant 32 bits: 32 bits of randomness, generated anew for each ID.
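
A timestamp-plus-randomness ID of this shape might look like the sketch below. The function name `random_time_id`, the one-second timestamp resolution, and the use of Python's `secrets` module as the randomness source are assumptions for illustration; a 32-bit count of seconds since the Unix epoch also wraps around in 2106.

```python
import secrets
import time


def random_time_id(now_seconds=None):
    """Return a 64-bit ID: 32-bit timestamp (high bits) + 32 random bits (low bits)."""
    if now_seconds is None:
        now_seconds = int(time.time())
    timestamp = now_seconds & 0xFFFFFFFF   # keep only the low 32 bits
    random_part = secrets.randbits(32)     # fresh randomness for every ID
    return (timestamp << 32) | random_part
```

IDs created in different seconds sort by creation time; IDs created within the same second are ordered arbitrarily, and two of them collide only if their random halves happen to match.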

d) The easy way out: use UUIDs / GUIDs.
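
As one example of the UUID route, Python's standard `uuid` module can produce random (version 4) UUIDs with no coordination at all. The wrapper name `new_id` is just a placeholder for this sketch, and note that version 4 UUIDs are 128 bits wide and do not sort by generation time.

```python
import uuid


def new_id():
    """Return a random (version 4) UUID; no shared state or worker ID needed."""
    return uuid.uuid4()


if __name__ == "__main__":
    print(new_id())  # prints a fresh 36-character UUID string on each run
```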
