Is GridFS fast and reliable enough for production?


Question

I am developing a new website and want to use GridFS as storage for all user uploads, because it offers many advantages over normal filesystem storage.

Benchmarks with GridFS served by nginx indicate that it is not as fast as a normal filesystem served by nginx.

Benchmark with nginx

Is anyone out there already using GridFS in a production environment, or would you use it for a new project?

Answer

I use GridFS at work on one of our servers, which is part of a price-comparison website with respectable traffic (around 25k visitors per day). The server does not have much RAM (2 GB), and the CPU is not particularly fast (Core 2 Duo, 1.8 GHz), but it has plenty of storage: 10 TB of SATA disks in a RAID 0 configuration. The server's job is very simple:

Each product on our price comparer has an image (there are around 10 million products according to our product database). When a visitor requests an image that is not yet in the grid, the server downloads it, resizes it, stores it in GridFS, and delivers it to the visitor's browser. If the image is already stored in the grid, the server delivers it directly. You could call this a 'traditional CDN schema'.
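The request flow described above can be sketched roughly as follows. This is only an illustration, not the PHP script from the answer: `InMemoryStore`, `serve_image`, `fake_download`, and `fake_resize` are hypothetical names, and the dictionary-backed store merely stands in for a GridFS-backed one so the example runs without a database.

```python
from typing import Callable, Dict, List


class InMemoryStore:
    """Hypothetical stand-in for a GridFS-backed store, keyed by product id."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def exists(self, key: str) -> bool:
        return key in self._files

    def put(self, key: str, data: bytes) -> None:
        self._files[key] = data

    def get(self, key: str) -> bytes:
        return self._files[key]


def serve_image(
    store: InMemoryStore,
    key: str,
    download: Callable[[str], bytes],
    resize: Callable[[bytes], bytes],
) -> bytes:
    """Return the stored image; on a miss, download, resize and store it first."""
    if not store.exists(key):
        original = download(key)
        store.put(key, resize(original))
    return store.get(key)


# Usage: the second request is served from the store without a new download.
calls: List[str] = []


def fake_download(key: str) -> bytes:
    # Assumption: a real implementation would fetch the image over HTTP.
    calls.append(key)
    return b"original:" + key.encode()


def fake_resize(data: bytes) -> bytes:
    # Assumption: a real implementation would use an imaging library.
    return b"resized:" + data


store = InMemoryStore()
first = serve_image(store, "p1", fake_download, fake_resize)
second = serve_image(store, "p1", fake_download, fake_resize)
```

Keeping the download and resize steps as plain callables means the store is the only part that would need replacing when moving from this sketch to a real GridFS client.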

We have stored and processed 4 million images on this server since it went live. The resizing and storing is done by a simple PHP script, though a Python script or something like Java could well be faster.

Current data size: 11.23 GB

Current storage size: 12.5 GB

Indices: 5

Index size: 849.65 MB

About reliability: it is very reliable. The server is not under heavy load, the index size is fine, and queries are fast.

About speed: it is certainly not as fast as local file storage, maybe 10% slower, but it is fast enough to be used in real time even when the image needs to be processed, which in our case depends heavily on PHP. Maintenance and development time have also gone down: deleting one or several images is now as simple as querying the database with a delete command. Another interesting point: when we rebooted our old server, which used local file storage (millions of files in thousands of folders), it sometimes hung for hours because the system was performing a file integrity check. We no longer have this problem with GridFS; our images are now stored in big MongoDB chunks (2 GB files).
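To illustrate why deletion becomes a query rather than a directory walk, here is a minimal sketch using a hypothetical in-memory `ImageStore`. The class, its `delete_matching` helper, and the `shop` and `product` metadata fields are all assumptions made for the example. With a real client such as PyMongo's `gridfs.GridFS`, the rough equivalent would be to `find` the matching files and call `delete` on each file id.

```python
from typing import Any, Dict, List, Tuple


class ImageStore:
    """Hypothetical in-memory stand-in for a GridFS collection."""

    def __init__(self) -> None:
        # file id -> (metadata, image bytes)
        self._files: Dict[int, Tuple[Dict[str, Any], bytes]] = {}
        self._next_id = 1

    def put(self, data: bytes, **metadata: Any) -> int:
        """Store an image with its metadata and return the new file id."""
        file_id = self._next_id
        self._next_id += 1
        self._files[file_id] = (metadata, data)
        return file_id

    def find(self, **criteria: Any) -> List[int]:
        """Return ids of files whose metadata matches every criterion."""
        return [
            file_id
            for file_id, (metadata, _) in self._files.items()
            if all(metadata.get(k) == v for k, v in criteria.items())
        ]

    def delete(self, file_id: int) -> None:
        """Delete a single file; unknown ids are ignored."""
        self._files.pop(file_id, None)

    def delete_matching(self, **criteria: Any) -> int:
        """Delete every matching file and return how many were removed."""
        ids = self.find(**criteria)
        for file_id in ids:
            self.delete(file_id)
        return len(ids)

    def __len__(self) -> int:
        return len(self._files)


# Usage: remove all images belonging to one (hypothetical) shop in one call.
store = ImageStore()
store.put(b"a", shop="shop-1", product="p1")
store.put(b"b", shop="shop-1", product="p2")
kept = store.put(b"c", shop="shop-2", product="p3")

removed = store.delete_matching(shop="shop-1")
```

The point of the sketch is only that selection happens through metadata, so no knowledge of a folder layout is needed to find the files to remove.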

So, in my opinion: yes, GridFS is fast and reliable enough to be used in production.
