Proposed solution: Generate unique IDs in a distributed environment

Published: 2020/10/10 2:44:43. Tags: php, distributed, couchbase


Problem description

I've been browsing the net trying to find a solution that will allow us to generate unique IDs in a regionally distributed environment.

I looked at the following options (among others):

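Snowflake (by Twitter)

It seems like a nice solution, but I don't like the added complexity of having to manage another piece of software just to create IDs;
It lacks documentation at this stage, so I don't think it would be a good investment;
The nodes need to be able to communicate with one another using Zookeeper (what about latency and communication failures?)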

UUID

Just look at it: 550e8400-e29b-41d4-a716-446655440000;
It's a 128-bit ID;
There have been some known collisions (depending on the version, I guess); see this post.
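For reference, a version 4 (random) UUID like the one above can be produced in plain PHP. This is only a minimal sketch: the function name uuid_v4 is made up for illustration, and it assumes PHP 7 or later for random_bytes().

```php
<?php
// Hypothetical helper: builds a version 4 (random) UUID from 16 random bytes.
function uuid_v4(): string
{
    $bytes = random_bytes(16);                         // 128 bits from the system CSPRNG
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);   // set the version nibble to 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);   // set the variant bits to 10
    $hex = bin2hex($bytes);                            // 32 hex characters
    return sprintf(
        '%s-%s-%s-%s-%s',
        substr($hex, 0, 8),
        substr($hex, 8, 4),
        substr($hex, 12, 4),
        substr($hex, 16, 4),
        substr($hex, 20, 12)
    );
}

echo uuid_v4(), "\n"; // output differs on every call
```

Of the 128 bits, 6 are fixed by the version and variant fields, so 122 bits are random.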

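Auto-increment in a relational database (like MySQL)

This seems safe, but unfortunately we are not using a relational database (scalability preferences);
We could deploy a MySQL server for this like Flickr does, but again, this introduces another point of failure/bottleneck, and it adds complexity.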

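Auto-increment in a non-relational database, like Couchbase

This could work since we are using Couchbase as our database server, but;
When we have more than one cluster in different regions, latency issues and network failures come into play: at some point, IDs will collide, depending on the amount of traffic;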

Let's say that we have clusters consisting of 10 Couchbase nodes and 10 application nodes in each of 5 different regions (Africa, Europe, Asia, America and Oceania). This is to ensure that content is served from the location closest to the user (to boost speed) and to ensure redundancy in case of disasters etc.

Now, the task is to generate IDs that won't collide when replication (and balancing) occurs, and I think this can be achieved in 3 steps:

Step 1

All regions will be assigned integer IDs (unique identifiers):

1 - Africa;
2 - America;
3 - Asia;
4 - Europe;
5 - Oceania.

Step 2

Assign an ID to every application node that is added to the cluster, keeping in mind that there may be up to 99 999 servers in one cluster (even though I doubt it: just as a safety precaution). This will look something like this (fake IPs):

00001 - 192.187.22.14
00002 - 164.254.58.22
00003 - 142.77.22.45
and so forth.

Please note that all of these are in the same cluster, which means each region can have its own node 00001.
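Steps 1 and 2 could be sketched in PHP roughly as follows. The constant REGION_CODES and the helper format_node_id are names made up for this illustration; the only thing taken from the steps above is the region numbering and the fixed 5-digit node format.

```php
<?php
// Region codes from Step 1.
const REGION_CODES = [
    'Africa'  => 1,
    'America' => 2,
    'Asia'    => 3,
    'Europe'  => 4,
    'Oceania' => 5,
];

// Hypothetical helper: left-pads a node number to the fixed 5-digit form of Step 2.
function format_node_id(int $node): string
{
    if ($node < 1 || $node > 99999) {
        throw new InvalidArgumentException('node ID must be between 1 and 99999');
    }
    return str_pad((string) $node, 5, '0', STR_PAD_LEFT);
}

echo REGION_CODES['Europe'], ' ', format_node_id(5), "\n"; // prints "4 00005"
```

Keeping the node ID at a fixed width is what lets the region and node be read back from the front of a joined ID later.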

Step 3

For every record inserted into the database, an incremented ID will be used to identify it, and this is how it will work:

Couchbase offers an increment feature that we can use to create IDs internally within the cluster. To ensure redundancy, 3 replicas will be created within the cluster. Since these are in the same place, I think it should be safe to assume that unless the whole cluster is down, one of the nodes responsible for this will be available; otherwise the number of replicas can be increased.
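The exact Couchbase call depends on the SDK version in use, so the sketch below does not show it. Instead it uses an in-memory stand-in behind a made-up Counter interface, just to illustrate the "increment and return the new value" behaviour that Step 3 relies on. Both Counter and InMemoryCounter are hypothetical names.

```php
<?php
// Hypothetical abstraction: increment a named counter and return the new value.
interface Counter
{
    public function next(string $key): int;
}

// In-memory stand-in for illustration only. In the proposal this role is
// played by Couchbase's increment feature inside a single cluster.
final class InMemoryCounter implements Counter
{
    /** @var array<string,int> */
    private $values = [];

    public function next(string $key): int
    {
        $this->values[$key] = ($this->values[$key] ?? 0) + 1;
        return $this->values[$key];
    }
}

$counter = new InMemoryCounter();
echo $counter->next('user_id'), "\n"; // prints 1
echo $counter->next('user_id'), "\n"; // prints 2
```

A Couchbase-backed class implementing the same interface would be swapped in for real use; the in-memory version is not safe across processes or nodes.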

Putting it all together

Say a user is signing up from Europe: the application node serving the request will grab the region code (4 in this case), get its own ID (say 00005) and then get an incremented ID (1) from Couchbase (from the same cluster).

We end up with 3 components: 4, 00005, 1. Now, to create an ID from this, we can just join these components into 4.00005.1. To make it even better (I'm not too sure about this), we can concatenate the components (not add them up) to end up with: 4000051.

In code, this will look something like this:

$id = '4' . '00005' . '1';

NB: not $id = 4 + 00005 + 1;
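A slightly fuller sketch of the same idea, with the joining wrapped in a function. The name compose_id is made up for illustration. Note that a bare 00005 literal is read by PHP as octal (value 5), which is one more reason to treat the components as strings.

```php
<?php
// Hypothetical helper: joins region code, zero-padded node ID and counter as text.
function compose_id(int $region, int $node, int $counter): string
{
    $nodePart = str_pad((string) $node, 5, '0', STR_PAD_LEFT);
    return $region . $nodePart . $counter;   // string concatenation, not arithmetic
}

echo compose_id(4, 5, 1), "\n"; // prints 4000051
echo 4 + 5 + 1, "\n";           // prints 10: addition loses the components
```

Because the region is one digit and the node ID is always five digits, the first six characters of the result always identify the region and node, and everything after them is the counter.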

Pros

IDs look better than UUIDs;
They seem unique enough. Even if a node in another region generated the same incremented ID and has the same node ID as the one above, we always have the region code to set them apart;
They can still be stored as integers (probably big unsigned integers);
It's all part of the architecture, no added complexity.
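On the point about storing IDs as integers: the joined ID grows as the counter grows, and PHP's native integers are signed, so it may be worth checking that an ID still fits. The helper below is a hypothetical sketch (the name fits_native_int is made up) that compares the decimal string against PHP_INT_MAX without converting it.

```php
<?php
// Hypothetical check: does a decimal ID string fit in PHP's native (signed) integer?
function fits_native_int(string $id): bool
{
    if (preg_match('/\A[0-9]+\z/', $id) !== 1) {
        return false;                          // not a plain decimal number
    }
    $max = (string) PHP_INT_MAX;
    $id = ltrim($id, '0');
    if ($id === '') {
        $id = '0';                             // the input was all zeros
    }
    if (strlen($id) !== strlen($max)) {
        return strlen($id) < strlen($max);     // fewer digits always fits
    }
    return strcmp($id, $max) <= 0;             // same length: text order equals numeric order
}

var_dump(fits_native_int('4000051'));          // bool(true)
var_dump(fits_native_int(PHP_INT_MAX . '0'));  // bool(false)
```

On a 64-bit build PHP_INT_MAX has 19 digits, so with the 6-digit region and node prefix a counter of at least 12 digits still fits.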

Cons

No sorting (or is there)?
This is where I need your input (most)

I know that every solution has flaws, and possibly more than what we see on the surface. Can you spot any issues with this whole approach?

Thank you in advance for your help :-)

Edit

As @DaveRandom suggested, we can add a 4th step:

Step 4

We can just generate a random number and append it to the ID to prevent predictability. Effectively, you end up with something like this:

4000051357 instead of just 4000051.
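A minimal sketch of Step 4, assuming a fixed-width 3-digit suffix as in the example above. The helper name with_random_suffix is made up, and random_int() requires PHP 7 or later.

```php
<?php
// Hypothetical helper: appends a fixed-width random suffix (3 digits by default).
function with_random_suffix(string $id, int $digits = 3): string
{
    $max = (10 ** $digits) - 1;               // 999 for 3 digits
    $suffix = random_int(0, $max);            // drawn from the system CSPRNG
    return $id . str_pad((string) $suffix, $digits, '0', STR_PAD_LEFT);
}

echo with_random_suffix('4000051'), "\n"; // e.g. 4000051357, differs per call
```

Zero-padding the suffix keeps its width constant, so the counter can still be recovered by dropping the last three digits.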
