Generating v5 UUID. What is name and namespace?

Question

I've read the man page, but I do not understand what name and namespace are for.

For version 3 and version 5 UUIDs the additional command line arguments namespace and name have to be given. The namespace is either a UUID in string representation or an identifier for internally pre-defined namespace UUIDs (currently known are "ns:DNS", "ns:URL", "ns:OID", and "ns:X500"). The name is a string of arbitrary length.

Namespace:

The namespace is either a UUID in string representation or an

Does it mean that I need to store it (UUID v4) somewhere in relation to the generated UUID v5? In either case, why is this not done automatically?

The name is a string of arbitrary length.

Is the name a completely random string? What is its purpose then? Can it be decoded from the UUID v5?

Recommended answer

Name and namespace can be used to create a hierarchy of (very probably) unique UUIDs.

Roughly speaking, a type 3 or type 5 UUID is generated by hashing a namespace identifier together with a name. Type 3 UUIDs use MD5 and type 5 UUIDs use SHA-1. Only 128 bits are available, and 6 of them are fixed (4 for the version and 2 for the variant), so not all of the hash bits make it into the UUID. (Also, MD5 is considered cryptographically broken and SHA-1 is on its last legs, so don't use this to verify data that needs to be "very secure".) That said, it gives you a repeatable, verifiable "hash" function that maps a possibly hierarchical name onto a probabilistically unique 128-bit value, potentially acting like a hierarchical hash or MAC.
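To make the hashing step concrete, here is a minimal Python sketch that builds a type 5 UUID by hand and checks it against the standard library's uuid.uuid5. The helper name uuid5_by_hand is made up for illustration, and the name is assumed to be encoded as UTF-8 (which is what Python's own implementation does for string names).

```python
import hashlib
import uuid


def uuid5_by_hand(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Hypothetical helper: derive a type 5 UUID from namespace + name."""
    # Hash the 16 raw namespace bytes followed by the name bytes.
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    # SHA-1 yields 20 bytes; only the first 16 fit into a UUID.
    raw = bytearray(digest[:16])
    # Overwrite the 4-bit version field with 5.
    raw[6] = (raw[6] & 0x0F) | 0x50
    # Overwrite the 2-bit variant field with binary 10 (RFC 4122 variant).
    raw[8] = (raw[8] & 0x3F) | 0x80
    return uuid.UUID(bytes=bytes(raw))


by_hand = uuid5_by_hand(uuid.NAMESPACE_DNS, "example.com")
# The same inputs always produce the same UUID as the library function.
assert by_hand == uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
print(by_hand, by_hand.version)
```

Because the result is a truncated one-way hash, neither the name nor the namespace can be recovered from it; the only way to "verify" a type 5 UUID is to recompute it from the same two inputs.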

Suppose you have a (key,value) store, but it only supports one namespace. You can generate a large number of distinct logical namespaces using type 3 or type 5 UUIDs. First, create a root UUID for each namespace. This could be a type 1 (host+timestamp) or type 4 (random) UUID so long as you stash it somewhere. Alternatively you could create one random UUID for your root (or use the null UUID: 00000000-0000-0000-0000-000000000000 as root) and then create a reproducible UUID for each namespace using "uuid -v5 $ROOTUUID $NAMESPACENAME". Now you can create unique UUIDs for keys within a namespace using "uuid -v5 $NAMESPACEUUID $KEY". These UUIDs can be thrown into a single key-value store with high probability of avoiding collision. This process can be repeated recursively so that if for instance the "value" associated with a UUID key in turn represents some sort of logical "namespace" like a bucket, container or directory, then its UUID can be used in turn to generate more hierarchical UUIDs.
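The scheme described above might look like the following Python sketch, which uses the standard uuid module instead of the uuid command-line tool. The namespace names ("users", "orders"), the keys, and the helper functions are hypothetical, and the nil UUID is used as the root purely for illustration.

```python
import uuid

# Root of the hierarchy: the nil UUID here, but a stored type 4 UUID works too.
ROOT = uuid.UUID("00000000-0000-0000-0000-000000000000")


def namespace_uuid(namespace_name: str) -> uuid.UUID:
    """Reproducible UUID for a logical namespace, derived from the root."""
    return uuid.uuid5(ROOT, namespace_name)


def key_uuid(namespace_name: str, key: str) -> uuid.UUID:
    """Reproducible UUID for a key inside a logical namespace."""
    return uuid.uuid5(namespace_uuid(namespace_name), key)


# A single flat store holding keys from two logical namespaces.
store = {}
store[key_uuid("users", "alice")] = {"kind": "user record"}
store[key_uuid("orders", "alice")] = {"kind": "order record"}

# The same key in different namespaces maps to different UUIDs,
# and recomputing a UUID finds the same entry again.
assert key_uuid("users", "alice") != key_uuid("orders", "alice")
print(store[key_uuid("users", "alice")])
```

Nothing besides the root needs to be stored: every namespace and key UUID can be recomputed on demand from the names, which is why the derivation is not "done automatically" from a random value.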

The generated type 3 or type 5 UUID holds a (partial) hash of the namespace id and name-within-namespace (key). It no more holds the namespace UUID than a message MAC holds the contents of the message it was computed from, so neither the namespace nor the name can be decoded from it. The name is an "arbitrary" (octet) string from the perspective of the uuid algorithm. Its meaning, however, depends on your application. It could be a filename within a logical directory, an object-id within an object-store, and so on.

While this works well for a moderately large number of namespaces and keys, it eventually runs out of steam if you are aiming for a very large number of keys that are unique with very high probability. The Wikipedia entry for the Birthday Problem (aka Birthday Paradox) includes a table that gives the probability of at least one collision for various numbers of keys and table sizes. For 128 bits, hashing 26 billion keys this way has a collision probability of p=10^-18 (negligible), but 26 trillion keys increase the probability of at least one collision to p=10^-12 (one in a trillion), and 26*10^15 keys increase it to p=10^-6 (one in a million). Adjusting for the 6 bits that encode the UUID version and variant, it runs out somewhat faster: with 122 hash bits, roughly 3 trillion keys give about a 1-in-a-trillion chance of at least one collision.
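These figures can be reproduced approximately with the usual birthday-problem estimate p ≈ 1 - exp(-n(n-1)/(2N)), where n is the number of keys and N = 2^bits is the number of possible values. The sketch below is only an approximation, and the function name is made up for illustration; the 122-bit case assumes that 4 version bits and 2 variant bits are fixed and carry no hash output.

```python
import math


def collision_probability(num_keys: int, bits: int) -> float:
    """Approximate probability of at least one collision (birthday bound)."""
    space = 2.0 ** bits
    exponent = -(num_keys * (num_keys - 1)) / (2.0 * space)
    # -expm1(x) equals 1 - exp(x) but stays accurate for tiny probabilities.
    return -math.expm1(exponent)


for keys in (26 * 10**9, 26 * 10**12, 26 * 10**15):
    print(f"{keys:.1e} keys, 128 bits: p ~ {collision_probability(keys, 128):.1e}")

# With only 122 hash-derived bits the same key counts collide more readily.
print(f"{10**12:.1e} keys, 122 bits: p ~ {collision_probability(10**12, 122):.1e}")
```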

See http://en.wikipedia.org/wiki/Birthday_problem#Probability_table for the probability table.

See http://www.ietf.org/rfc/rfc4122.txt for more details on UUID encodings.
