SHA-1 对密码存储是否安全? [英] Is SHA-1 secure for password storage?

查看:17
本文介绍了SHA-1 对密码存储是否安全?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

结论: SHA-1 与任何对抗原像攻击的东西一样安全,但它易于计算,这意味着更容易发起暴力破解或字典攻击.(对于像 SHA-256 这样的继承者也是如此.)根据具体情况,设计为计算量大的哈希函数(例如 bcrypt)可能是更好的选择.

Conclusion: SHA-1 is as safe as anything against preimage attacks, however it is easy to compute, which means it is easier to mount a bruteforce or dictionary attack. (The same is true for successors like SHA-256.) Depending on the circumstances, a hash function which was designed to be computationally expensive (such as bcrypt) might be a better choice.

有些人经常抛出诸如SHA-1 已损坏"之类的言论,所以我试图理解这究竟是什么意思.假设我有一个 SHA-1 密码哈希数据库,一个拥有最先进的 SHA-1 破解算法和一个拥有 100,000 台机器的僵尸网络的攻击者可以访问它.(拥有超过 10 万台家用电脑的控制权意味着他们每秒可以执行大约 10^15 次操作.)他们需要多长时间

Some people throw around remarks like "SHA-1 is broken" a lot, so I'm trying to understand what exactly that means. Let's assume I have a database of SHA-1 password hashes, and an attacker whith a state of the art SHA-1 breaking algorithm and a botnet with 100,000 machines gets access to it. (Having control over 100k home computers would mean they can do about 10^15 operations per second.) How much time would they need to

  1. 找出任何一个用户的密码?
  2. 找出给定用户的密码?
  3. 找出所有用户的密码?
  4. 想办法以用户之一的身份登录?
  5. 找到一种以特定用户身份登录的方法?

如果密码被加盐,情况会如何改变?加盐的方法(前缀,后缀,两者,或更复杂的东西,如异或)重要吗?

How does that change if the passwords are salted? Does the method of salting (prefix, postfix, both, or something more complicated like xor-ing) matter?

这是我目前的理解,经过一些谷歌搜索.如果我误解了什么,请在答案中更正.

Here is my current understanding, after some googling. Please correct in the answers if I misunderstood something.

  • 如果没有盐,彩虹攻击会立即找到所有密码(极长密码除外).
  • 如果有足够长的随机盐,找出密码的最有效方法是暴力破解或字典攻击.碰撞和原像攻击都无助于找出实际密码,因此针对 SHA-1 的加密攻击在这里无济于事.使用什么算法甚至都无关紧要 - 甚至可以使用 MD5 或 MD4 并且密码同样安全(存在细微差别,因为计算 SHA-1 哈希值较慢).
  • 为了评估同样安全"的安全性,我们假设一次 sha1 运行需要 1000 次操作,并且密码包含大写、小写和数字(即 60 个字符).这意味着攻击者每天可以测试 1015*60*60*24/1000 ~= 1017 个潜在密码.对于蛮力攻击,这意味着在 3 小时内测试最多 9 个字符、每周最多 10 个字符、一年最多 11 个字符的所有密码.(每增加一个字符需要 60 倍的时间.)字典攻击要快得多(即使是只有一台计算机的攻击者也可以在数小时内完成),但只能找到弱密码.
  • 以用户身份登录,攻击者不需要找出确切的密码;找到一个产生相同哈希的字符串就足够了.这称为第一次原像攻击.据我所知,没有针对 SHA-1 的原像攻击.(蛮力攻击需要 2160 次操作,这意味着我们的理论攻击者需要 1030 年才能成功.理论上可能性的极限约为 260 操作,攻击需要几年时间.)有 针对缩减版 SHA-1 的原像攻击,影响可忽略不计(对于使用 44 步而不是 80 步的缩减版 SHA-1,攻击时间从 2160 操作减少到 2157).针对 SHA-1 的碰撞攻击完全在理论上的可能性(我发现的最好的 将时间从 280 降低到 252),但这些都是对密码哈希无用,即使没有加盐.
  • If there is no salt, a rainbow attack will immediately find all passwords (except extremely long ones).
  • If there is a sufficiently long random salt, the most effective way to find out the passwords is a brute force or dictionary attack. Neither collision nor preimage attacks are any help in finding out the actual password, so cryptographic attacks against SHA-1 are no help here. It doesn't even matter much what algorithm is used - one could even use MD5 or MD4 and the passwords would be just as safe (there is a slight difference because computing a SHA-1 hash is slower).
  • To evaluate how safe "just as safe" is, let's assume that a single sha1 run takes 1000 operations and passwords contain uppercase, lowercase and digits (that is, 60 characters). That means the attacker can test 1015*60*60*24 / 1000 ~= 1017 potential password a day. For a brute force attack, that would mean testing all passwords up to 9 characters in 3 hours, up to 10 characters in a week, up to 11 characters in a year. (It takes 60 times as much for every additional character.) A dictionary attack is much, much faster (even an attacker with a single computer could pull it off in hours), but only finds weak passwords.
  • To log in as a user, the attacker does not need to find out the exact password; it is enough to find a string that results in the same hash. This is called a first preimage attack. As far as I could find, there are no preimage attacks against SHA-1. (A bruteforce attack would take 2160 operations, which means our theoretical attacker would need 1030 years to pull it off. Limits of theoretical possibility are around 260 operations, at which the attack would take a few years.) There are preimage attacks against reduced versions of SHA-1 with negligible effect (for the reduced SHA-1 which uses 44 steps instead of 80, attack time is down from 2160 operations to 2157). There are collision attacks against SHA-1 which are well within theoretical possibility (the best I found brings the time down from 280 to 252), but those are useless against password hashes, even without salting.

简而言之,使用 SHA-1 存储密码似乎非常安全.我错过了什么吗?

In short, storing passwords with SHA-1 seems perfectly safe. Did I miss something?

更新: Marcelo 指出一篇文章提到 2 中的第二次原像攻击106 个操作.(Thomas 解释说,这种攻击是一种假设结构,不适用于现实生活场景.)不过,我仍然不明白这对使用 SHA-1 作为密钥派生函数有何危险.是否有充分的理由认为碰撞攻击或第二次原像攻击最终可以转变为第一原像攻击?

Update: Marcelo pointed out an article which mentions a second preimage attack in 2106 operations. ( As Thomas explains, this attack is a hypothetical construct which does not apply to real-life scenarios.) I still don't see how this spells danger for the use of SHA-1 as a key derivation function, though. Are there generally good reasons to think that a collision attack or a second preimage attack can be eventually turned into a first preimage attack?

推荐答案

对您的问题的简短回答是:SHA-1 尽可能安全.MD5 也可以,甚至 MD4;但这可能会让一些投资者感到紧张.对于公共关系,最好使用更好"的哈希函数,例如SHA-256,即使您将其输出截断为 160 或 128 位(以节省存储成本).一些 SHA-3 第 2 轮候选人 似乎比 SHA-1 更快,同时可以说更安全";但它们仍然有点新,所以现在坚持使用 SHA-256 或 SHA-512 将是一条更安全的路线.这会让你看起来很专业和谨慎,这很好.

The short answer to your question is: SHA-1 is as secure as you can get. MD5 would be fine too, even MD4; but it could make some investors nervous. For public relations, it is best to use a "better" hash function, e.g. SHA-256, even if you truncate its output to 160 or 128 bits (to save on storage cost). Some of the SHA-3 round-2 candidates appear to be faster than SHA-1 while being arguably "more secure"; yet they are still a bit new, so sticking to SHA-256 or SHA-512 would be a safer route right now. It would make you look professional and cautious, which is good.

请注意,尽可能安全"与绝对安全"不同.请参阅下文以获得相当冗长的解释.

Note that "as secure as you can get" is not the same as "perfectly safe". See below for rather lengthy explanations.

关于已知攻击:

对 MD4、MD5 和 SHA-1 的已知攻击是关于碰撞的,不会影响原像抗性.已经证明 MD4 有一些弱点可以(仅在理论上)在尝试破解 HMAC/MD4 时被利用,但这不适用于您的问题.Kesley 和 Schneier 的论文中的 2106 秒原像攻击是一种通用权衡,仅适用于非常长的输入(260 字节;即一百万 TB -- 注意 106+60 是如何超过 160 的;这就是你看到权衡没有任何魔力的地方.

The known attacks on MD4, MD5 and SHA-1 are about collisions, which do not impact preimage resistance. It has been shown that MD4 has a few weaknesses which can be (only theoretically) exploited when trying to break HMAC/MD4, but this does not apply to your problem. The 2106 second preimage attack in the paper by Kesley and Schneier is a generic trade-off which applies only to very long inputs (260 bytes; that's a million terabytes -- notice how 106+60 exceeds 160; that's where you see that the trade-off has nothing magic in it).

此消息的其余部分假定您使用的哈希函数(例如 SHA-1)是一个黑匣子",没有攻击者可能使用的特殊属性.即使使用损坏的"哈希函数 MD5 和 SHA-1,这也是您现在所拥有的.

The rest of this message assumes that the hash function you use (e.g. SHA-1) is a "black box" with no special property that the attacker may use. That's what you have right now even with the "broken" hash functions MD5 and SHA-1.

关于彩虹桌:

彩虹攻击"实际上是字典或蛮力攻击的成本分摊.它是 时间的派生词- 内存权衡 由 Hellman 在 1980 年首次描述.假设您有 N 个可能的密码(即字典的大小,或 2n 如果你考虑暴力破解一个输出为 n 位的散列函数),有一个分时攻击,你预先计算了 N 个散列密码并将它们存储在一个大表中.如果您对哈希输出进行排序,您可以在一次查找中获取您的密码.rainbow table 是存储该表的一种智能方式,并且空间大大减少.您只存储 N/t 个散列密码,并使用 O(t2) 次查找来破解密码.Rainbow 表允许您以虚拟方式处理比实际存储容量大得多的预计算表.

The "rainbow attack" is actually cost-sharing of a dictionary or brute force attack. It is a derivative from the time-memory trade-off first described by Hellman in 1980. Assuming that you have N possible passwords (that's the size of your dictionary, or 2n if you consider brute-forcing a hash function with an output of n bits), there is a time-sharing attack in which you precompute the N hashed passwords and store them in a big table. If you sort the hash outputs, you can get your password in a single lookup. A rainbow table is a smart way to store that table with a much reduced space. You store only N/t hashed passwords, and you crack passwords with O(t2) lookups. Rainbow tables allow you to virtually handle precomputed tables much larger than what you can realistically store.

但是,无论彩虹与否,攻击者仍然必须至少运行一次完整攻击.这可以看作是几个连续的优化层:

However, rainbow or not, the attacker still has to run the full attack at least once. This can be viewed as several successive optimization layers:

  1. 暴力破解/字典攻击已花费 N 来破解每个密码.
  2. 使用预先计算的表格,攻击者支付 N 一次的成本,然后可以以非常小的额外成本攻击 许多 个密码密码.
  3. 如果预先计算的表是彩虹表,那么 N 可以稍微大一些,因为 存储 成本降低了.N 上的瓶颈在于攻击者可以收集的 CPU 能力,而不是他的硬盘大小.
  1. The brute-force / dictionary attack has cost N for cracking each password.
  2. With a pre-computed table, the attacker pays that cost N once and can thereafter attack many passwords with very small extra cost per password.
  3. If the pre-computed table is a rainbow table, then N can be somewhat bigger, because storage cost is reduced. The bottleneck on N becomes the CPU power that the attacker can muster, not the size of his harddisks.

如果 N 足够大,以至于散列 N 个密码的 CPU 成本是可笑的,那么这种攻击是不可行的,无论是否使用彩虹表或不是.这意味着具有 80 位或更多位输出的(抗原像的)哈希函数足以使暴力攻击不可行.

If N is large enough that the CPU-cost of hashing N passwords is ludicrous, then such an attack is not feasible, regardless of whether rainbow tables are used or not. This means that a (preimage-resistant) hash function with an output of 80 bits or more is enough to make brute-force attack infeasible.

关于盐:

盐是一种击败预计算的方法.在上面的描述中,加盐将攻击者带回到第 1 步:加盐可防止攻击者在多个被攻击密码之间分摊 O(N) 成本.预先计算的表,更何况彩虹表,不再可行.

Salts are a way to defeat pre-computations. In the description above, the salt brings back the attacker to step 1: salting prevents the attacker from sharing the O(N) cost between several attacked passwords. Pre-computed tables, a fortiori rainbow tables, are no longer feasible.

您想要加盐,因为当散列数据包含在 密码 中时,即适合随机人类大脑的东西,那么 N 可能非常低:人类在选择和记住密码方面真的很糟糕.这就是字典攻击"的含义:假设许多用户密码将位于该特别选择的空间中,从而使用减少的潜在密码空间(字典").

You want salting because when the hashed data consists in passwords, i.e. something which fits within the brain of a random human being, then N can be quite low: humans are really bad at choosing and remembering passwords. This is what "dictionary attacks" are about: that's using a reduced space of potential passwords (the "dictionary") under the assumption that many user passwords will be in that specially selected space.

因此,加盐至少可以防止攻击者使用预先计算的表,尤其是预先计算的彩虹表.这假设攻击者能够破解一两个密码;我们不希望他以很少的额外开销破解 1000 个其他密码.

Hence salting will at least prevent the attacker from using pre-computed tables, in particular pre-computed rainbow tables. This assumes that the attacker will be able to break one password or two; we do not want him to break 1000 other passwords with little extra overhead.

另外,加盐对公共关系有好处.

Also, salting is good for public relations.

关于 SHA-1 成本:

SHA-1 的基本成本是散列一个 64 字节的块.这就是 SHA-1 的工作原理:数据被填充,然后分成 64 字节的块.在 Intel Core2 系统上,处理单个块的成本约为 500 个时钟周期,而这仅针对单个内核.MD5 和 MD4 更快,分别计算大约 400 和 250 个周期.不要忘记大多数现代 CPU 都有多个内核,因此请相应地增加.

The elementary cost of SHA-1 is about hashing a 64-byte block. That's how SHA-1 works: data is padded, then split into 64-byte blocks. The cost of processing a single block is about 500 clock cycles on an Intel Core2 system, and that's for a single core. MD5 and MD4 are faster, count about 400 and 250 cycles, respectively. Do not forget that most modern CPU have several cores, so multiply accordingly.

一些加盐方案规定了大量的盐;例如进入散列函数的实际上是单个 128 位盐的 40000 个连续副本,然后是密码本身.对于合法用户和攻击者来说,这使得密码散列更加昂贵(在我的示例中是 10000 倍).这是否是一个好主意取决于设置.对于在桌面系统上登录,这很好:用户甚至不会注意到哈希他的密码需要 10 毫秒,而不是 1 微秒;但是攻击者的成本已经增加了非常明显的 10000 倍.在每秒有数千个客户端的共享服务器上,总成本可能会变得令人望而却步.从概念上讲,将合法用户和攻击者的标准提高相同的因素最终并不是好的安全性;但在某些特定情况下,这可能是一个值得的想法.

Some salting schemes prescribe huge salts; e.g. what enters the hash function is actually 40000 successive copies of a single 128-bit salt, followed by the password itself. This makes password hashing more expensive (by a factor of 10000 with my example), both for the legitimate user and for the attacker. Whether this is a good idea depends on the setup. For login on a desktop system, this is good: the user will not even notice that it took 10ms to hash his password, instead of 1µs; but the cost for the attacker has risen by a very noticeable factor 10000. On shared servers with thousands of clients per second, the aggregate cost may become prohibitive. Conceptually, raising the bar by the same factor for the legitimate user and the attacker is not ultimately good security; but it can be a worthwhile idea in some specific situations.

关于在线攻击:

以上所有内容都是关于击败离线攻击.离线攻击是攻击者拥有测试"密码所需的所有数据的攻击;例如攻击者可以获得保存散列密码的数据库副本.在离线攻击中,攻击者仅受其计算资源的限制.相反,online 攻击是一种攻击,攻击者的每次猜测都必须通过诚实的验证者(例如,攻击者只是尝试登录受攻击的系统).通过强制限制每秒可以尝试的密码数量来阻止在线攻击.极端的例子是智能卡在三个错误的 PIN 后关闭.

All of the above is about defeating offline attacks. An offline attack is an attack where the attacker has all the data he needs in order to "test" passwords; e.g. the attacker could get a copy of the database holding the hashed passwords. In an offline attack, the attacker is limited only by his computational resources. Conversely, an online attack is an attack where each guess by the attacker must go through an honest verifier (e.g. the attacker simply tries to log in on the attacked system). Online attacks are thwarted by enforcing limits on how many passwords can be tried per second. Extreme examples are smartcards which shut down after three wrong PINs.

通常,为了密码安全,安排系统不让攻击者构建离线攻击会带来更多回报.这就是 Unix 系统所做的:散列密码,以前在世界可读的 /etc/password 文件中,现在在 /etc/shadow 文件中,即防止读取访问,除了少数特权应用程序.这里的假设是,如果攻击者可以读取 /etc/shadow,那么他可能对系统有足够的控制权,以至于他不再需要密码......

Usually, for password security, it pays off much more to arrange the system for not letting an attacker build an offline attack. That's what Unix systems do: the hashed passwords, which used to be in the world-readable /etc/password file, are now in the /etc/shadow file which is protected against read access, except by a few privileged applications. The assumption here is that if the attacker can read /etc/shadow, then he probably has enough control over the system that he does not really need passwords anymore...

这篇关于SHA-1 对密码存储是否安全?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆