Should uploaded files be renamed?


Question

I've been reading up on PHP file upload security and a few articles have recommended renaming the files. For example, the OWASP article Unrestricted File Upload says:

It is recommended to use an algorithm to determine the filenames. For instance, a filename can be a MD5 hash of the name of file plus the date of the day.

If a user uploads a file named Cake Recipe.doc is there really any reason to rename it to 45706365b7d5b1f35?

If the answer is yes, for whatever reason, then how do you keep track of the original file name and extension?

Solution

To your primary question, whether it is good practice to rename files, the answer is a definite yes, especially if you are building some form of file repository where users upload files (and filenames) of their choosing, for several reasons:

  1. Security -- if you have a poorly written application that allows files to be downloaded by name or through direct access (it's horrid, but it happens), it's much harder for a user, whether malicious or not, to "guess" the names of files.
  2. Uniqueness -- the likelihood of two different people uploading a file of the same name is very high (e.g. avatar.gif, readme.txt, video.avi). Using a unique identifier significantly decreases the likelihood that two files will have the same name.
  3. Versioning -- it is much easier to keep multiple "versions" of a document using unique names. It also avoids the need for additional code that parses a filename in order to change it. A simple example would be renaming document.pdf to document(1).pdf, which becomes more complicated once you stop underestimating users' ability to create horrible names for things.
  4. Length -- working with known filename lengths is always better than working with unknown filename lengths. I can always know that (my filepath) + (X letters) is a certain length, whereas (my filepath) + (random user filename) is completely unknown.
  5. OS -- the length issue above can also create problems when attempting to write extremely random or long filenames to a drive. You have to account for special characters, length limits, and trimmed filenames (the user may not receive a working file because the extension has been trimmed).
  6. Execution -- it's easy for the OS to execute a file with an extension such as .exe or .php (insert any other extension). It's harder when there is no extension.
  7. URL encoding -- ensuring the name is URL safe. Cake Recipe.doc is not a URL-safe name, and on some systems (server or browser side) and in some situations it can cause inconsistencies when the name should be a urlencoded value.
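
Points 4 through 7 can be illustrated with a minimal sketch. The helper `make_stored_name()` is hypothetical and not part of the original answer, and it uses `random_bytes()` as a stand-in naming scheme purely to show the properties of a generated name (fixed length, no extension, URL safe) next to the user-supplied one:

```php
<?php
// Hypothetical helper: builds a storage name from random bytes instead of user input.
function make_stored_name(): string
{
    // 16 random bytes -> 32 lowercase hex characters, no extension.
    return bin2hex(random_bytes(16));
}

$original = 'Cake Recipe.doc';
$stored   = make_stored_name();

// The user-supplied name changes when URL-encoded; the generated name does not.
var_dump(rawurlencode($original) === $original); // bool(false)
var_dump(rawurlencode($stored) === $stored);     // bool(true)

// The generated name always has the same length and carries no extension.
var_dump(strlen($stored));                       // int(32)
var_dump(pathinfo($stored, PATHINFO_EXTENSION)); // string(0) ""
```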

As for storing the information, you would typically do this in a database. This is no different from what you need already, since you need a way to refer back to the file (who uploaded it, what its name is, occasionally where it is stored, the time of upload, sometimes the size). You're simply adding the actual stored name of the file alongside the user's name for the file.
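
A rough sketch of that mapping follows, assuming PDO with the SQLite driver is available. The table name `uploads`, its columns, the helper functions, and the sample values (`alice`, `20480`) are all hypothetical and chosen only for illustration:

```php
<?php
// In-memory SQLite keeps the sketch self-contained; a real application
// would use its existing database. Table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec(
    'CREATE TABLE uploads (
        id            INTEGER PRIMARY KEY AUTOINCREMENT,
        stored_name   TEXT NOT NULL UNIQUE,
        original_name TEXT NOT NULL,
        uploaded_by   TEXT NOT NULL,
        size_bytes    INTEGER NOT NULL,
        uploaded_at   TEXT NOT NULL
    )'
);

// Record one upload: the name on disk plus the name the user gave the file.
function record_upload(PDO $pdo, string $storedName, string $originalName, string $user, int $size): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO uploads (stored_name, original_name, uploaded_by, size_bytes, uploaded_at)
         VALUES (?, ?, ?, ?, ?)'
    );
    $stmt->execute([$storedName, $originalName, $user, $size, date('c')]);
}

// Look up the user's original name from the stored name, or null if unknown.
function original_name(PDO $pdo, string $storedName): ?string
{
    $stmt = $pdo->prepare('SELECT original_name FROM uploads WHERE stored_name = ?');
    $stmt->execute([$storedName]);
    $name = $stmt->fetchColumn();
    return $name === false ? null : $name;
}

record_upload($pdo, '45706365b7d5b1f35', 'Cake Recipe.doc', 'alice', 20480);
echo original_name($pdo, '45706365b7d5b1f35'), "\n"; // Cake Recipe.doc
```

When the file is later served for download, the looked-up original name can be sent back to the browser (for example in a `Content-Disposition` header), so the user still receives Cake Recipe.doc even though the file on disk has a generated name.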

The OWASP recommendation isn't a bad one: a hash of the filename plus a timestamp (not just the date) will be unique in most cases. I take it a step further and include the microtime with the timestamp, and often some other unique piece of information, so that two uploads of the same small file within the same timeframe cannot produce the same name. I also store the date of the upload, which is additional insurance against MD5 collisions, which become more likely in systems that store many files over many years. It is incredibly unlikely that you would generate two identical MD5 hashes, using filename and microtime, on the same day. An example would be:

$filename = date('Ymd') . '_' . md5($uploaded_filename . microtime());
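
As a small sketch of how that one-liner might be used, here it is wrapped in a function. The name `stored_name_for()` is hypothetical, and the checks only confirm the shape of the result: an 8-digit date, an underscore, and a 32-character MD5 hex digest.

```php
<?php
// Hypothetical wrapper around the one-liner above.
function stored_name_for(string $uploadedFilename): string
{
    // date('Ymd') gives 8 digits; md5() gives 32 hex characters.
    return date('Ymd') . '_' . md5($uploadedFilename . microtime());
}

$name = stored_name_for('Cake Recipe.doc');

var_dump(strlen($name));                               // int(41)
var_dump(preg_match('/^\d{8}_[0-9a-f]{32}$/', $name)); // int(1)
```

Whatever the user called the file, the stored name is always 41 characters long, which ties back to the point about known filename lengths.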

My 2 cents.
