Best way to detect duplicate uploaded files in a Java Environment?

Question

As part of a Java based web app, I'm going to be accepting uploaded .xls & .csv (and possibly other types of) files. Each file will be uniquely renamed with a combination of parameters and a timestamp.

I'd like to be able to identify any duplicate files. By duplicate I mean, the exact same file regardless of the name. Ideally, I'd like to be able to detect the duplicates as quickly as possible after the upload, so that the server could include this info in the response. (If the processing time by file size doesn't cause too much of a lag.)

I've read about running MD5 on the files and storing the result as unique keys, etc... but I've got a suspicion that there's a much better way. (Is there a better way?)

Any advice on how best to approach this is appreciated.

Thanks.

UPDATE: I have nothing at all against using MD5. I've used it a few times in the past with Perl (Digest::MD5). I thought that in the Java world, another (better) solution might have emerged. But, it looks like I was mistaken.

Thank you all for the answers and comments. I'm feeling pretty good about using MD5 now.

Recommended answer

While processing uploaded files, decorate the OutputStream with a DigestOutputStream so that you can calculate the digest of the file while writing. Store the final digest somewhere along with the unique identifier of the file (in hex as part of filename maybe?).
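A minimal sketch of that idea, under some assumptions: the class name `UploadDeduplicator`, the method names, and the in-memory map are all hypothetical and only illustrate the approach. A real application would more likely keep the digest in a database column with a unique index. `DigestOutputStream` and `MessageDigest` are the standard `java.security` classes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of any library.
final class UploadDeduplicator {

    // Assumed in-memory index: hex digest -> identifier of the first file seen with it.
    private final Map<String, String> digestToFileId = new ConcurrentHashMap<>();

    /**
     * Writes the upload to target and computes its digest in the same pass,
     * so the file content is read only once. Returns the digest as lowercase hex.
     */
    static String saveAndDigest(InputStream upload, Path target)
            throws IOException, NoSuchAlgorithmException {
        // "MD5" as discussed in the question; "SHA-256" is another standard algorithm name.
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (OutputStream out = new DigestOutputStream(Files.newOutputStream(target), md)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = upload.read(buffer)) != -1) {
                out.write(buffer, 0, read); // updates the digest and writes to disk
            }
        }
        return toHex(md.digest());
    }

    /**
     * Records the digest for fileId. Returns the identifier of an earlier file
     * with the same digest if there is one, or empty if this content is new.
     */
    Optional<String> register(String hexDigest, String fileId) {
        return Optional.ofNullable(digestToFileId.putIfAbsent(hexDigest, fileId));
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

After each upload, the server could call `saveAndDigest` and then `register`; a non-empty result means a file with identical content was stored earlier, and that can be reported in the response. Since distinct files can in principle share a digest, a byte-by-byte comparison on a match is an optional extra check.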
