Is SHA sufficient for checking file duplication? (sha1_file in PHP)


Question

Suppose you wanted to make a file hosting site where people upload their files and send a link to their friends to retrieve them later, and you want to ensure files are not duplicated where they are stored. Is PHP's sha1_file good enough for the task? Is there any reason not to use md5_file instead?

On the frontend, the hash will be hidden behind the original file name, which is stored in a database. An additional concern is whether this would reveal anything about the original poster. Does a file carry any meta information with it, such as the last-modified time or who posted it, or does that information live in the file system?

Also, is using a salt frivolous, since security against rainbow table attacks means nothing here and the hash could later be used as a checksum?

One last thing: scalability. Initially it's only going to be used for small files a couple of megabytes in size, but eventually...

Edit 1: The point of the hash is primarily to avoid file duplication, not to create obscurity.

Answer

As per my comment on ykaganovich's answer, sha1 is (surprisingly) slightly faster than md5.
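The relative speed of the two algorithms depends on the PHP build and the hardware, so it is worth measuring rather than assuming. The following is a rough timing sketch, not a rigorous benchmark: the helper name `timeHash`, the 4 MiB file size, and the repeat count are all arbitrary choices made for illustration. It uses `hash_file()`, which produces the same digests as `md5_file()` and `sha1_file()`.

```php
<?php
// Rough timing sketch: hashes the same temporary file with md5 and sha1.
// timeHash() is a hypothetical helper; sizes and repeat counts are assumptions.

function timeHash(string $algo, string $path, int $repeats = 5): float
{
    $start = hrtime(true);                      // monotonic clock, in nanoseconds
    for ($i = 0; $i < $repeats; $i++) {
        hash_file($algo, $path);                // digest is discarded, only time matters
    }
    return (hrtime(true) - $start) / 1e9 / $repeats;   // average seconds per run
}

// Build a throwaway 4 MiB file of random data to hash.
$path = tempnam(sys_get_temp_dir(), 'bench');
file_put_contents($path, random_bytes(4 * 1024 * 1024));

$timings = [];
foreach (['md5', 'sha1'] as $algo) {
    $timings[$algo] = timeHash($algo, $path);
    printf("%-5s %.4f s per run\n", $algo, $timings[$algo]);
}

unlink($path);                                  // remove the temporary file
```

For files of a few megabytes, either digest is likely to take a small fraction of the time spent receiving the upload, so speed alone is a weak reason to pick one over the other.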

From your description of the problem, you are not trying to create a secure hash, merely to hide the file in a large namespace, so salts and rainbow tables are irrelevant. The only consideration is the likelihood of a false collision (where two different files give the same hash). The probability of this happening by accident with md5, a 128-bit digest, is very, very remote. It's even more remote with sha1, whose digest is 160 bits. However, you do need to think about what happens when two independent users upload the same warez to your site. Who owns the file?
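One possible way to handle both concerns is sketched below, under stated assumptions: the file content is stored once under its `sha1_file()` digest, each uploader gets a separate ownership record, and a byte-for-byte comparison runs whenever a digest already exists, so a false collision is detected instead of silently serving the wrong file. The helpers `storeUpload` and `filesIdentical`, the `$owners` array (standing in for a database table), and the storage directory are all hypothetical.

```php
<?php
// Sketch only: an in-memory array stands in for a database table, and
// storeUpload()/filesIdentical() are hypothetical helpers, not part of PHP.

function filesIdentical(string $a, string $b): bool
{
    if (filesize($a) !== filesize($b)) {
        return false;                           // different sizes cannot match
    }
    $fa = fopen($a, 'rb');
    $fb = fopen($b, 'rb');
    $same = true;
    while (!feof($fa) && !feof($fb)) {
        if (fread($fa, 8192) !== fread($fb, 8192)) {
            $same = false;                      // first differing chunk ends the scan
            break;
        }
    }
    fclose($fa);
    fclose($fb);
    return $same;
}

function storeUpload(string $tmpPath, string $origName, int $userId,
                     string $storeDir, array &$owners): string
{
    $hash = sha1_file($tmpPath);                // 40 hex characters, contents only
    $blob = $storeDir . '/' . $hash;

    if (!file_exists($blob)) {
        copy($tmpPath, $blob);                  // first time this content is seen
    } elseif (!filesIdentical($tmpPath, $blob)) {
        // Same digest, different bytes: refuse rather than overwrite or alias.
        throw new RuntimeException("Hash collision for $hash");
    }

    // One record per uploader, so no single user "owns" the shared blob.
    $owners[] = ['user' => $userId, 'name' => $origName, 'hash' => $hash];
    return $hash;
}

// Demo: two users upload identical content under different names.
$storeDir = sys_get_temp_dir() . '/dedup_' . bin2hex(random_bytes(4));
mkdir($storeDir);
$a = tempnam(sys_get_temp_dir(), 'up');
$b = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($a, 'same bytes');
file_put_contents($b, 'same bytes');

$owners = [];
$h1 = storeUpload($a, 'holiday.zip', 1, $storeDir, $owners);
$h2 = storeUpload($b, 'backup.zip', 2, $storeDir, $owners);
$blobCount = count(glob($storeDir . '/*'));

echo $blobCount, " blob(s), ", count($owners), " owner record(s)\n";
unlink($a);
unlink($b);
```

The two uploads have different names but produce the same digest, because `sha1_file()` reads only the file's contents, not its name or timestamps. With per-user records, deleting a file becomes a matter of removing one record and dropping the blob only when no records reference it.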

In fact, there doesn't seem to be any reason at all to use a hash - just generate a sufficiently long random value.
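A minimal sketch of the random-value approach, assuming PHP 7 or later for `random_bytes()`. The helper name `makeFileToken` and the 20-byte default are illustrative choices, not something the answer specifies.

```php
<?php
// Sketch only: makeFileToken() is a hypothetical helper name.

function makeFileToken(int $bytes = 20): string
{
    // random_bytes() returns cryptographically secure random bytes;
    // bin2hex() encodes each byte as two hexadecimal characters.
    return bin2hex(random_bytes($bytes));
}

$token = makeFileToken();
echo strlen($token), "\n";                      // 40, the length of a sha1 hex digest
```

Note the trade-off: a random token identifies an upload, not its content, so two uploads of the same file receive different tokens and are not detected as duplicates.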

C.
