Image upload storage strategies

Question

When a user uploads an image to my site, the image goes through this process:

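  • The user uploads an image
  • The image metadata is stored in the DB and the image is given a unique ID
  • The image is processed asynchronously (thumbnail creation, cropping, etc.)
  • All images are stored in the same uploads folder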

So far the site is pretty small, and there are only ~200,000 images in the uploads directory. I realise I'm nowhere near the physical limit of files within a directory, but this approach clearly won't scale, so I was wondering if anyone had any advice on upload / storage strategies for handling large volumes of image uploads.

Creating username (or more specifically, userid) subfolders would seem to be a good solution. With a bit more digging, I've found some great info right here: How to store images in your filesystem.
However, would this userid dir approach scale well if a CDN is brought into the equation?

Recommended answer

I've answered a similar question before but I can't find it; maybe the OP deleted his question...

Anyway, Adam's solution seems to be the best so far, yet it isn't bulletproof: images/c/cf/ (or any other dir/subdir pair) could still contain up to 16^30 unique hashes, and at least 3 times more files if we count image extensions, which is far more than any regular file system can handle.
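To make the images/c/cf/ layout concrete, here is a minimal sketch of how such a path could be derived from a hash. The function name `sharded_path`, the `images` root, and the assumption that the file name is a hex digest are illustrative only and are not taken from Adam's answer.

```python
from pathlib import PurePosixPath


def sharded_path(digest: str, ext: str, root: str = "images") -> str:
    """Build root/<first char>/<first two chars>/<digest>.<ext>.

    `digest` is assumed to be a hex string such as an MD5 hash
    (hypothetical choice; the thread does not name the hash function).
    """
    if len(digest) < 2:
        raise ValueError("digest must have at least two characters")
    digest = digest.lower()
    # The directory pair only fixes the first two hex characters, so the
    # remaining characters of the hash are all free within one directory.
    return str(PurePosixPath(root, digest[0], digest[:2], f"{digest}.{ext}"))


print(sharded_path("cf1c2ee287d639adda1cdb44c189ae93", "png"))
```

With a 32-character digest, the two directory levels pin down only the first 2 characters, which is where the 16^30 figure for a single directory comes from.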

AFAIK, SourceForge.net also uses this system for project repositories. For instance, the "fatfree" project would be placed at projects/f/fa/fatfree/; however, I believe they limit project names to 8 chars.

I would store the image hash in the database along with a DATE / DATETIME / TIMESTAMP field indicating when the image was uploaded / processed, and then place the image in a structure like this:

images/
  2010/                                      - Year
    04/                                      - Month
      19/                                    - Day
        231c2ee287d639adda1cdb44c189ae93.png - Image Hash

Or:

images/
  2010/                                    - Year
    0419/                                  - Month & Day (12 * 31 = 372)
      231c2ee287d639adda1cdb44c189ae93.png - Image Hash

Besides being more descriptive, this structure is enough to host hundreds of thousands of images per day (depending on your file system limits) for several thousand years. This is the way WordPress and others do it, and I think they got it right on this one.
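A date-based path like the ones shown above could be computed roughly as follows. This is only a sketch: the name `dated_path` is made up, and MD5 is assumed because the example file name is 32 hex characters long, not because the answer specifies it.

```python
import hashlib
from datetime import datetime
from pathlib import PurePosixPath


def dated_path(content: bytes, uploaded_at: datetime, ext: str,
               root: str = "images") -> str:
    """Build root/YYYY/MM/DD/<hash>.<ext> for an uploaded image."""
    # Assumption: the file name is the MD5 hex digest of the image bytes.
    digest = hashlib.md5(content).hexdigest()
    return str(PurePosixPath(
        root,
        f"{uploaded_at:%Y}",  # year, e.g. 2010
        f"{uploaded_at:%m}",  # zero-padded month, e.g. 04
        f"{uploaded_at:%d}",  # zero-padded day, e.g. 19
        f"{digest}.{ext}",
    ))


print(dated_path(b"raw image bytes", datetime(2010, 4, 19), "png"))
```

Because the upload date is also stored in the database, the path can be rebuilt from the row alone, without scanning the file system.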

Duplicated images could be easily queried on the database, and you'd just have to create symlinks.
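The duplicate handling could look something like the sketch below. A plain dict stands in for the database lookup by hash, and `store_image` is a hypothetical helper; it also assumes a file system where `os.symlink` is permitted.

```python
import hashlib
import os
from datetime import date
from pathlib import Path


def store_image(root: Path, index: dict, content: bytes,
                day: date, ext: str) -> Path:
    """Write the image under root/YYYY/MM/DD, or symlink to an existing copy.

    `index` maps hash -> path of the first stored copy and stands in for
    the database table described above (an assumption for this sketch).
    """
    digest = hashlib.md5(content).hexdigest()
    target = root / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}" / f"{digest}.{ext}"
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.exists():
        return target  # same image already stored for this day
    original = index.get(digest)
    if original is None:
        target.write_bytes(content)  # first time this hash is seen
        index[digest] = target
    else:
        # Duplicate: link to the first copy instead of storing the bytes again.
        os.symlink(os.path.relpath(original, target.parent), target)
    return target
```

Using a relative link target keeps the links valid if the whole images/ tree is moved or mounted elsewhere.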

Of course, if this is not enough for you, you can always add more subdirs (hours, minutes, ...).

Personally, I wouldn't use user IDs unless you don't have that info available in your database, because:

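  1. It discloses usernames in the URL
  2. Usernames are volatile (you may be able to rename folders, but still...)
  3. A user can hypothetically upload a large number of images
  4. It serves no purpose (?)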

Regarding the CDN, I don't see any reason this scheme (or any other) wouldn't work...
