Are hash collisions with different file sizes just as likely as same file size?


Problem description



I'm hashing a large number of files, and to avoid hash collisions, I'm also storing each file's original size. That way, even if there's a hash collision, it's extremely unlikely that the file sizes will also be identical. Is this sound (a colliding file is equally likely to be of any size), or do I need another piece of information (because a colliding file is more likely to have the same length as the original)?

Or, more generally: Is every file just as likely to produce a particular hash, regardless of original file size?

Solution

It depends on your hash function, but in general, files of the same size with different content are less likely to produce the same hash than files of different sizes. Still, it would probably be cleaner to simply use a time-tested hash with a larger output space (e.g. MD5 instead of CRC32, or SHA-1 instead of MD5) than to bet on your own solution such as storing the file size.
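As a rough illustration of both ideas, here is a minimal Python sketch. It is not from the original answer: the function names `file_fingerprint` and `collision_probability` are made up for this example, SHA-256 is just one choice of a larger hash, and the probability estimate is the standard birthday-bound approximation, which assumes the hash output is uniformly distributed.

```python
import hashlib
import math


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (SHA-256 hex digest, size in bytes) for the file at `path`.

    Hypothetical helper: the (digest, size) pair mirrors the scheme
    described in the question, with a wider hash than CRC32 or MD5.
    """
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


def collision_probability(n_files, hash_bits):
    """Approximate chance that at least two of n_files share a hash.

    Birthday-bound approximation: 1 - exp(-n(n-1)/2 / 2**bits).
    Assumes an idealized, uniformly distributed hash (an assumption,
    not a property of any particular real hash function).
    """
    pairs = n_files * (n_files - 1) / 2
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-pairs / 2 ** hash_bits)
```

Under that idealized model, `collision_probability(100_000, 32)` comes out well above one half, while `collision_probability(100_000, 128)` is vanishingly small, which is the practical argument for widening the hash rather than relying on the file size as a tiebreaker.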
