Generate custom length hash values of a String in Swift

Question

Is it possible to somehow "hash" a given String with length n to a hash value of an arbitrary length m? I want to achieve something like follows:

let s1 = "<UNIQUE_USER_IDENTIFIER_1>"
let s2 = "<UNIQUE_USER_IDENTIFIER_2>"

// Desired call shape; hashValue(length:) is not an existing String API.
let x1 = s1.hashValue(length: 4)
let x2 = s2.hashValue(length: 4)

I want to assign each given user a (e.g. four-digit) number, that is based on its unique UID. Is that possible?

Recommended answer

First, I want to be clear that you mean "hash" and don't mean "(lossless) compress." You should expect some collisions where x1 and x2 are the same value for different s1 and s2. If you really mean a mapping so that there are no collisions, then we have to know a lot more about the problem. It is impossible to achieve that in the general case (see the Pigeonhole principle). But it can be achieved in some special cases where there is sufficient redundancy in the input. Or it can be done by maintaining a table (i.e. a database or the like). The rest of this answer is about hashing.

If your UID is a UUID created on iOS (or any v4 UUID), then its bits are already quite high quality, and the last four digits should be fine without doing any hashing at all. There are a couple of bytes in the middle that you should avoid, but the whole end section is random and so an ideal hash.
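As a minimal sketch of that idea, assuming the identifier is available as a Foundation `UUID`, the last two bytes can be read directly as a 16-bit value (four hex digits). The function name `shortCode(from:)` is a hypothetical helper, not an existing API.

```swift
import Foundation

/// Hypothetical helper: returns the last two bytes of a UUID
/// (its last four hex digits) as a number in 0...65535.
func shortCode(from uuid: UUID) -> UInt16 {
    let bytes = uuid.uuid  // tuple of 16 UInt8 values
    // Bytes 14 and 15 are the final two bytes of the random tail.
    return (UInt16(bytes.14) << 8) | UInt16(bytes.15)
}

let code = shortCode(from: UUID())
print(String(code, radix: 16))
```

If a four-digit decimal number is wanted instead, `code % 10_000` would work, with a slight bias toward the values 0 to 5535 because 65536 is not a multiple of 10000.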

If your UUID is not random, you can try using the default hashes and pulling the required number of bits out of them, but non-cryptographic hashes don't always have good independence between their bits, so this may collide more than you like.
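One caveat with Swift's built-in `hashValue` is that it is seeded per process, so its values are not guaranteed to match across launches and it cannot back a persistent user number. The sketch below therefore swaps in 32-bit FNV-1a, a simple non-cryptographic hash that the answer does not name, purely to illustrate "hash, then keep a few bits". The function names are hypothetical.

```swift
/// 32-bit FNV-1a over the UTF-8 bytes of a string.
/// Chosen here only as an illustration of a stable non-cryptographic hash.
func fnv1a32(_ text: String) -> UInt32 {
    var hash: UInt32 = 2166136261          // FNV offset basis
    for byte in text.utf8 {
        hash ^= UInt32(byte)
        hash = hash &* 16777619            // FNV prime, wrapping multiply
    }
    return hash
}

/// Hypothetical helper: reduces the hash to a number in 0..<10000.
func fourDigitCode(_ text: String) -> UInt32 {
    return fnv1a32(text) % 10_000
}

print(fourDigitCode("<UNIQUE_USER_IDENTIFIER_1>"))
```

As the answer warns, the bits of a hash like this are not guaranteed to be independent, so the reduced value may collide more often than an ideal hash would.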

In that case, use a cryptographic hash larger than the size you need and truncate it (take either the most-significant or the least-significant bits; either set is fine). This is commonly done in cryptography. For example, SHA-512/256 is a commonly used hash that computes a 512-bit hash and extracts 256 bits from it. Cryptographic hashes require high independence of all their bits, so any subset of bits will be as collision resistant as its length allows.
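A minimal sketch of the truncation approach, assuming Apple's CryptoKit is available (iOS 13 / macOS 10.15 or later) and that SHA-256 is an acceptable choice; the answer does not prescribe a specific hash. The method name `truncatedHash(digits:)` is hypothetical and stands in for the `hashValue(length:)` call in the question.

```swift
import Foundation
import CryptoKit

extension String {
    /// Hypothetical helper: hashes the string with SHA-256, keeps the first
    /// 8 bytes, and reduces them to a zero-padded decimal string of `digits` digits.
    func truncatedHash(digits: Int) -> String {
        precondition((1...18).contains(digits), "digits must be in 1...18")
        let digest = SHA256.hash(data: Data(self.utf8))
        // Fold the first 8 digest bytes into one UInt64 (big-endian order).
        let value = digest.prefix(8).reduce(UInt64(0)) { ($0 << 8) | UInt64($1) }
        var modulus: UInt64 = 1
        for _ in 0..<digits { modulus *= 10 }   // 10^18 still fits in UInt64
        let number = String(value % modulus)
        // Left-pad with zeros so the result always has `digits` characters.
        return String(repeating: "0", count: digits - number.count) + number
    }
}

let x1 = "<UNIQUE_USER_IDENTIFIER_1>".truncatedHash(digits: 4)
let x2 = "<UNIQUE_USER_IDENTIFIER_2>".truncatedHash(digits: 4)
print(x1, x2)
```

Taking the remainder of a 64-bit value keeps the modulo bias negligible for short outputs. The result is stable across launches and devices, but two different identifiers can still map to the same number.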

BTW, if you mean "4 decimal digits," then you should expect a collision by the time you have about 100 users. If you mean 16 bits (4 hex digits), you should expect one by about 300 users. These are your best-case scenarios and mean your hash is working well. See Birthday Attack for a table of expectations and some helpful approximations.
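As a rough check of these figures, the usual birthday approximation p ≈ 1 − exp(−n(n − 1) / (2N)) gives the chance that at least two of n users share a value when there are N possible values. The sketch below assumes an ideal, uniformly distributed hash; the function name is hypothetical.

```swift
import Foundation

/// Approximate probability that at least two of `n` items collide
/// when each gets a uniformly random value out of `space` possibilities.
func collisionProbability(items n: Double, space: Double) -> Double {
    return 1 - exp(-n * (n - 1) / (2 * space))
}

// 4 decimal digits: 10,000 possible values.
print(collisionProbability(items: 118, space: 10_000))
// 16 bits (4 hex digits): 65,536 possible values.
print(collisionProbability(items: 301, space: 65_536))
```

Both calls come out close to 0.5, so a collision is about as likely as not at roughly 120 users for four decimal digits and roughly 300 users for 16 bits.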
