提案：本地唯一的GUID替代方案 [英] Proposal: locally unique GUID alternative

查看：102 发布时间：2018/8/2 13:40:11 indexing primary-key collision guid identifier

本文介绍了提案：本地唯一的GUID替代方案的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

问题

我正在寻找有关探索本地唯一GUID替代方案的反馈 ，具有以下要求：

I'm looking for feedback on this exploration of a locally unique alternative for GUIDs, with the following requirements:

碰撞的可能性非常低（我们宁愿每年碰撞一次而不是执行检查）

不泄漏敏感信息，例如存在多少项

在SQL数据库中具有高性能

是否可以复制/匹配手动查询（作为查询字符串和查询结果）

可用作不带编码的URI组件

Has very low chance of collisions (to the point that we'd rather collide once a year than perform checks)
Does not leak sensitive information, such as how many items exist
Has high performance in an SQL database
Is copy/pastable for manual querying (as both query string and query result)
Is usable as a URI component without encoding

为了满足要求，我决定采用 64位无符号整数的形式。它很容易在CPU上使用，对于主键使用很好和很小，半人类可读，仅数字，并且在手动查询时易于复制/粘贴。（作为反例，BLOB严重阻碍了对大多数SQL数据库的手动查询。）

To meet the requirements, I have decided on the form of a 64-bit unsigned integer. It is easy on the CPU, nice and small for primary key use, semi-human-readable, digits-only, and easy to copy/paste when querying by hand. (As a counter-example, BLOBs severely hinder manual querying on most SQL databases.)

此外，Percona 演示 单调递增值作为主键执行得更好，尤其是插入速度，所以这是一个目标特征。

Additionally, Percona demonstrates that monotonically increasing values perform much better as primary keys, particularly on insertion speed, so this is a trait to aim for.

建议结构

从左到右，最重要的位在左侧

46位。的时间戳即可。 Unix时间，以毫秒为单位。（至少在C＃中，亚毫秒时间不可用。）这将持续到4199年的某个地方。它给我们单调递增的值。

8位。 本地IP的一部分。机器内部IP地址的最后一个组件，即最快的可用网络接口。大多数服务器都应该是以太网LAN。

10位。的 Uniquefier 即可。一个静态计数器，在使用时递增（互锁），带有环绕。

46 bits. Timestamp. Unix time in milliseconds. (At least in C#, sub-millisecond time is not readily available.) This will last until somewhere in the year 4199. It gives us monotonically increasing values.
8 bits. Part of local IP. The last component of the machine's internal IP address, of the fastest available network interface. Should be ethernet LAN for most servers.
10 bits. Uniquefier. A static counter that is incremented (interlocked) on use, with wrap-around.

碰撞

每当有以下情况发生1/1024（~0.1％）的碰撞机会：

There is a 1/1024 (~0.1%) collision chance whenever:

两个系统共享相同的最后一个IP地址组件和以相同的毫秒进行呼叫。 这可以完全避免。

系统的时钟被转回并且它会在同一毫秒的呼叫中进行呼叫在时间变化之前。 这应该是一种非常罕见的情况，似乎符合要求。

Two systems share the same last IP address component and make a call in the same millisecond. This can be completely avoided.
A system's clock is turned back and it makes a call on the same millisecond of a call from before the time change. This should be a highly infrequent situation that seems within the requirements.

限制

有趣的是，我们似乎满足了要求（＃2是一个狡猾的要求）。让我们来看看一些限制。

Interestingly, we seem to be meeting the requirements (#2 being a dodgy one). Let's take a look at some of the limitations.

必须仔细维护服务器的本地IP地址 - 即使在不同的数据中心之间也是如此（如果适用）。

如果存在对IP的其他限制，我们不能支持超过255台服务器 - 可能更少。

我们泄漏有关创建了哪些标识符的信息由同一台服务器。我相信这也是许多GUID实现的情况。但是。

通过检查用户自己的请求之间的计数器增量，可以获得有关流量的信息。由于计数器用于各种数据，迅速增加并且难以归因于任何特定类型的数据，因此有效性降低。

标识符很多比那些有很多随机性的东西更可猜测。蛮力攻击每次尝试毫秒需要大约512次调用（uniquefier）。理想情况下，这种攻击不产生任何结果，即系统报告未授权，无论标识符是否不存在或不属于用户，并且抵抗定时攻击。实际上，让我们假设专门的攻击者会发现泄密。

Servers' local IP addresses must be carefully maintained - even amongst different data centers, if applicable.
We cannot support more than 255 servers - possibly fewer, if other constraints on the IPs exist.
We leak information about which identifiers were created by the same server. I believe this is the case with many GUID implementations also, however.
Information can be obtained about traffic volumes, by checking counter increments between a user's own requests. The effectiveness is diminished by the fact that the counter is used for various kinds of data, increasing rapidly and in a way that is hard to ascribe to any particular type of data.
Identifiers are much more guessable than ones that have plenty of randomness. A brute-force attack would need about 512 calls (uniquefier) per attempted millisecond. Ideally, this attack yields nothing, i.e. the system reports "unauthorized" regardless of whether the identifier is non-existent or does not belong to the user, and is resistant to timing attacks. Realistically, let's assume a dedicated attacker will find a leak.

注意事项

限制＃1和＃2必须适合公司。

Limitations #1 and #2 must simply fit the company.

限制＃ 3在现有的GUID实现中似乎被认为是可接受的，并且是我愿意接受的。

Limitation #3 appears to be deemed acceptable in existing GUID implementations, and is one I am willing to live with.

限制＃4是一个棘手的问题。这些信息有多敏感？因此，我们每分钟进行10K插入，进入未知数量的表格。相对数量确实提供了更多的洞察力：在08：00-09：00之间，活动的时间是一小时前的两倍。不过，这通常是特定领域的常识。意外的峰值可能会泄漏更多信息。所以系统在早上03:00努力工作。这有多糟糕？从暴露自动增量标识符的公司数量来看，我们可能会说这是一种改进，而不是......但它可能是一个交易破坏者。

Limitation #4 is a tricky one. How sensitive is this information? "So we do a 10K inserts per minute, into an unknown number of tables." Relative volumes do grant more insight: "Between 08:00-09:00 there is twice as much activity as an hour earlier." Still, this will usually be common knowledge in a given field. Unexpected peaks might leak some more info. "So the system works hard at 03:00 in the morning." How bad is all this? Judging by the number of companies that expose auto-increment identifiers, we might say it's an improvement more often than not... But it could be a deal-breaker.

我们可以使用（加密）随机位作为处理限制＃4的唯一文件，但这会引入第三次碰撞机会：每当系统在一毫秒内生成多个标识符时。生日悖论在那里特别成问题。

We could use (crypto)random bits as the uniquefier to deal with limitation #4, but that would introduce a third collision opportunity: whenever the system generates multiple identifiers within a millisecond. The birthday paradox is particularly problematic there.

如果我们允许时间戳已经在2527中回绕，我们可以释放2位。自私和对后代不敏感，或傲慢地假设我们的代码会被更长时间使用？： - ）

We could free up 2 bits if we allow the timestamp to wrap around in 2527 already. Selfish and insensitive to future generations, or arrogant to assume that our code would be used longer? :-)

还有什么？

我欢迎您错过我的反馈，改进，想法和限制！你会如何解决这个问题？

I welcome your feedback, improvements, ideas, limitations that I have missed! How would you solve this problem?

提案：本地唯一的GUID替代方案 [英] Proposal: locally unique GUID alternative

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

提案：本地唯一的GUID替代方案 [英] Proposal: locally unique GUID alternative

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭