是否可能获得相同的SHA1散列? [英] Is it possible to get identical SHA1 hash?

查看:359
本文介绍了是否可能获得相同的SHA1散列?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

给定两个不同的串S1和S2(S1!= S2)是可能的:

Given two different strings S1 and S2 (S1 != S2) is it possible that:

SHA1(S1) == SHA1(S2)

是True?


  1. 如果是 - 有什么概率?

  2. 如果没有 - 为什么不?

  3. 输入字符串的长度上有一个上限, ? OR是独立于字符串长度的SHA1的计算(因此重复的概率)

我试图实现的目标是散列一些敏感的ID字符串(可能与一些其他字段,如父ID连接在一起),以便我可以使用散列值作为ID(例如在数据库中)。

The goal I am trying to achieve is to hash some sensitive ID string (possibly joined together with some other fields like parent ID), so that I can use the hash value as an ID instead (for example in the database).

示例:

Resource ID: X123
Parent ID: P123

我不想暴露我的资源的性质,允许客户端看到X123-P123。

I don't want to expose the nature of my resource identifies to allow client to see "X123-P123".

相反,我想创建一个新的列哈希(X123-P123),让我们说它是AAAZZZ。然后客户端可以请求ID为AAAZZZ的资源,而不知道我的内部ID等。

Instead I want to create a new column hash("X123-P123"), let's say it's AAAZZZ. Then the client can request resource with id AAAZZZ and not know about my internal id's etc.

推荐答案

em>碰撞。冲突必然存在,因为SHA-1接受许多更多不同的消息作为输入,它可以产生不同的输出(SHA-1可以吃任何高达2 ^ 64位的位串,但是只输出160位;因此,至少一个输出值必须弹出几次)。这个观察适用于输出小于其输入的任何函数,无论该函数是否是好的散列函数。

What you describe is called a collision. Collisions necessarily exist, since SHA-1 accepts many more distinct messages as input that it can produce distinct outputs (SHA-1 may eat any string of bits up to 2^64 bits, but outputs only 160 bits; thus, at least one output value must pop up several times). This observation is valid for any function with an output smaller than its input, regardless of whether the function is a "good" hash function or not.

假设SHA-1行为类似随机oracle(一个概念对象,它基本上返回随机值,唯一的限制是一旦它在输入 m 上返回输出 v 在输入 上返回 ),则对于任何两个不同的串S1和S2,冲突概率应为2 ^( - 160)。仍然假设SHA-1表现得像一个随机的oracle,如果你收集许多输入字符串,那么你将开始观察到碰撞后收集了大约2 ^ 80这样的字符串。

Assuming that SHA-1 behaves like a "random oracle" (a conceptual object which basically returns random values, with the sole restriction that once it has returned output v on input m, it must always thereafter return v on input m), then the probability of collision, for any two distinct strings S1 and S2, should be 2^(-160). Still under the assumption of SHA-1 behaving like a random oracle, if you collect many input strings, then you shall begin to observe collisions after having collected about 2^80 such strings.

(这是2 ^ 80而不是2 ^ 160因为,用2 ^ 80字符串,你可以使约2 ^ 159对字符串。这通常被称为生日悖论,因为它对大多数人来说是一个惊喜应用于生日的碰撞。请参阅有关主题的维基百科页面。)

(That's 2^80 and not 2^160 because, with 2^80 strings you can make about 2^159 pairs of strings. This is often called the "birthday paradox" because it comes as a surprise to most people when applied to collisions on birthdays. See the Wikipedia page on the subject.)

现在我们强烈怀疑SHA-1不是真正的行为像一个随机的oracle,因为生日悖论方法是一个随机oracle的最佳冲突搜索算法。然而,有一个已发布的攻击,应该在约2 ^ 63步骤中发现冲突,因此比生日悖论算法快2 ^ 17 = 131072倍。这样的攻击不应该对真正的随机oracle是可行的。注意,这次攻击没有实际完成,它仍然是理论上的(有些人试过,但显然找不到足够的CPU功率)。然而,理论看起来很健康,它似乎SHA-1不是一个随机的oracle。

Now we strongly suspect that SHA-1 does not really behave like a random oracle, because the birthday-paradox approach is the optimal collision searching algorithm for a random oracle. Yet there is a published attack which should find a collision in about 2^63 steps, hence 2^17 = 131072 times faster than the birthday-paradox algorithm. Such an attack should not be doable on a true random oracle. Mind you, this attack has not been actually completed, it remains theoretical (some people tried but apparently could not find enough CPU power). Yet, the theory looks sound and it really seems that SHA-1 is not a random oracle. Correspondingly, as for the probability of collision, well, all bets are off.

对于第三个问题:对于具有 n 的函数,如果可以输入多于2 ^个 n 个不同的消息,即如果最大输入消息长度大于 n ,则必然存在冲突。使用 m 低于 n ,答案并不那么容易。如果函数表现为随机的oracle,则碰撞存在的概率随着m 而不是线性地下降,而在m = n / 2 >。这是与生日悖论相同的分析。对于SHA-1,这意味着如果 m < 80 ,那么很可能没有碰撞,而 m> 80 使得至少一个碰撞的存在非常可能( m> 160 确定性)。

As for your third question: for a function with a n-bit output, then there necessarily are collisions if you can input more than 2^n distinct messages, i.e. if the maximum input message length is greater than n. With a bound m lower than n, the answer is not as easy. If the function behaves as a random oracle, then the probability of the existence of a collision lowers with m, and not linearly, rather with a steep cutoff around m=n/2. This is the same analysis than the birthday paradox. With SHA-1, this means that if m < 80 then chances are that there is no collision, while m > 80 makes the existence of at least one collision very probable (with m > 160 this becomes a certainty).

请注意,存在碰撞和您发现碰撞之间存在差异。即使存在碰撞,你仍然有你的2 ^( - 160)每次你尝试的概率。前面的段落意味着,如果你不能(概念上)尝试2 ^ 160对字符串,这样的概率是相当无意义的。因为你限制自己的字符串小于80位。

Note that there is a difference between "there exists a collision" and "you find a collision". Even when a collision must exist, you still have your 2^(-160) probability every time you try. What the previous paragraph means is that such a probability is rather meaningless if you cannot (conceptually) try 2^160 pairs of strings, e.g. because you restrict yourself to strings of less than 80 bits.

这篇关于是否可能获得相同的SHA1散列?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆