How to use scrypt to generate a hash for a password and salt in Python

Question

I would like to use scrypt to create a hash for my users' passwords and salts. I have found two references, but there are things I don't understand about them.

They use the scrypt encrypt and decrypt functions. One encrypts a random string and the other encrypts the salt (which looks wrong since only the password and not the salt is used for decryption). It looks like the decrypt function is being used to validate the password/salt as a side effect of the decryption.

Based on the little I understand, what I want is a key derivation function (KDF) rather than encryption/decryption, and scrypt's encrypt/decrypt functions most likely derive a key with that KDF internally. Since the actual KDF is only used behind the scenes, I am concerned that blindly following these examples will lead to a mistake. If the scrypt encrypt/decrypt functions are used to generate and verify the password, I don't understand the role of the string being encrypted. Does its content or length matter?

Solution

You're correct - the scrypt functions those two links are playing with belong to the scrypt file encryption utility, not the underlying KDF. I've been slowly working on creating a standalone scrypt-based password hash for Python, and ran into this issue myself.

The scrypt file utility does the following: it picks scrypt's n/r/p parameters based on your system's speed and the "min time" parameter. It then generates a 32-byte salt and calls scrypt(n,r,p,salt,pwd) to create a 64-byte key. The binary string the tool returns is composed of: 1) a header containing the n, r, p values and the salt, encoded in binary; 2) a SHA-256 checksum of the header; and 3) an HMAC-SHA256 signed copy of the checksum, keyed with the first 32 bytes of the key. Following that, it uses the remaining 32 bytes of the key to AES-encrypt the input data.
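To make that flow concrete, here is a minimal sketch of the key-derivation and signing steps using Python's standard-library `hashlib.scrypt` (available since Python 3.6 when built against OpenSSL 1.1 or later). It follows the description above rather than the utility's source: the text header layout, the fixed n/r/p values, and the helper name `derive_key_halves` are assumptions made for illustration, and the real tool's byte layout may differ.

```python
import hashlib
import hmac
import os


def derive_key_halves(password: bytes, salt: bytes,
                      n: int = 2**14, r: int = 8, p: int = 1):
    """Derive a 64-byte scrypt key and split it into two 32-byte halves."""
    key = hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=64)
    return key[:32], key[32:]


salt = os.urandom(32)  # 32-byte random salt, as in the description
mac_key, enc_key = derive_key_halves(b"correct horse", salt)

# Hypothetical header: the real utility packs n, r, p and the salt in binary.
header = b"n=16384,r=8,p=1," + salt
checksum = hashlib.sha256(header).digest()

# Sign the checksum with the first half of the key; the second half
# (enc_key) would be the AES key for the file contents.
tag = hmac.new(mac_key, checksum, hashlib.sha256).digest()
```

Only `mac_key` and `tag` matter for checking a password; `enc_key` and the encrypted payload are there purely for the file-encryption use case.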

There are a couple of implications of this that I can see:

  1. The input data is meaningless, since it doesn't actually affect the salt being used, and encrypt() generates a new salt each time.

  2. You can't configure the n,r,p workload manually, or in any way other than the min-time parameter. This isn't insecure, but it is a rather awkward way to control the work factor.

  3. After the decrypt call regenerates the key and compares it against the HMAC, it will reject everything right there if your password is wrong - but if it's right, it'll proceed to also decrypt the data package. This is a lot of extra work the attacker won't have to perform - they don't even have to derive 64 bytes, just the 32 needed to check the signature. This issue doesn't make it insecure exactly, but doing work your attacker doesn't have to do is never desirable.

  4. There is no way to configure the salt size, derived key size, etc. The current values aren't that bad, but still, it's not ideal.

  5. The decrypt utility's "max time" limitation is wrong for password hashing - each time decrypt is called, it estimates your system's speed and does some "guessing" as to whether it can calculate the key within max time. That is more overhead your attacker doesn't have to deal with (see #3), and it also means decrypt could start rejecting correct passwords under heavy system load.

  6. I'm not sure why Colin Percival didn't make the KDF and parameter-choosing code part of the public API, but it's in fact explicitly marked "private" inside the source code - not even exported for linking. This makes me hesitant to just access it directly without a lot more study.
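For contrast with point 2, here is a rough sketch of setting the work factor explicitly when the KDF is called directly, again via the standard-library `hashlib.scrypt`. The helper names are hypothetical, and the memory figure is only the usual approximation of about 128 * r * (n + p) bytes, not an exact accounting of any particular implementation.

```python
import hashlib
import os


def scrypt_memory_bytes(n: int, r: int, p: int) -> int:
    """Approximate working memory of scrypt: about 128 * r * (n + p) bytes."""
    return 128 * r * (n + p)


def hash_with_work_factor(password: bytes, salt: bytes,
                          n: int, r: int, p: int) -> bytes:
    """Call the KDF with caller-chosen n/r/p instead of a min-time guess."""
    # Give the backend headroom above the estimate; larger n/r values
    # can otherwise exceed its default memory limit.
    maxmem = 2 * scrypt_memory_bytes(n, r, p)
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p,
                          maxmem=maxmem, dklen=32)


salt = os.urandom(16)
derived = hash_with_work_factor(b"hunter2", salt, n=2**14, r=8, p=1)
print(scrypt_memory_bytes(2**14, 8, 1) // 1024, "KiB (approx.)")
```

Because n, r, and p are plain arguments here, they can be stored next to the hash and raised later, with no dependence on how loaded the machine happens to be at verification time.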

All in all, what is needed is a nice hash format that can store scrypt, and an implementation that exposes the underlying kdf and parameter-choosing algorithm. I'm currently working on this myself for passlib, but it hasn't seen much attention :(
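As a sketch of what such a hash format could look like, the example below stores n, r, p, the salt, and the derived key in a single `$`-separated string and verifies with a constant-time comparison. It relies on `hashlib.scrypt` from the standard library; the `scrypt$n$r$p$salt$hash` layout and both function names are made up for illustration and are not passlib's format or any established standard.

```python
import base64
import hashlib
import hmac
import os


def hash_password(password: str, n: int = 2**14, r: int = 8, p: int = 1) -> str:
    """Return a self-describing string: scrypt$n$r$p$salt$hash (hypothetical format)."""
    salt = os.urandom(16)
    dk = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                        n=n, r=r, p=p, dklen=32)
    salt_b64 = base64.b64encode(salt).decode("ascii")
    dk_b64 = base64.b64encode(dk).decode("ascii")
    return f"scrypt${n}${r}${p}${salt_b64}${dk_b64}"


def verify_password(password: str, stored: str) -> bool:
    """Re-derive the key with the stored parameters and compare in constant time."""
    try:
        scheme, n, r, p, salt_b64, dk_b64 = stored.split("$")
    except ValueError:
        return False  # wrong number of fields
    if scheme != "scrypt":
        return False
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(dk_b64)
    dk = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                        n=int(n), r=int(r), p=int(p), dklen=len(expected))
    return hmac.compare_digest(dk, expected)


stored = hash_password("hunter2")
print(verify_password("hunter2", stored))  # correct password
print(verify_password("hunter3", stored))  # wrong password
```

Verification here derives exactly the bytes that are compared and nothing else, so there is no dummy payload to decrypt and no time-based guessing that could reject a correct password under load.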

Just to bottom-line things, though - those sites' instructions are "ok". I'd just use an empty string as the file content, and be aware of the extra overhead and issues.
