在庞大的SQL Server数据库BLOB数据 [英] Blob data in huge SQL Server database

查看:93
本文介绍了在庞大的SQL Server数据库BLOB数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们每年都20.000.000产生TEXTFILES,平均规模约为250 KB的每个(35 KB压缩)。

We have 20.000.000 generated textfiles every year, average size is approx 250 Kb each (35 Kb zipped).

我们必须把这些文件在某种存档的10年。无需搜索里面TEXTFILES,但我们必须能够在5-10元数据字段搜索,如产品名称,creationdate等。

We must put these files in some kind of archive for 10 years. No need to search inside textfiles, but we must be able to find one texfile by searching on 5-10 metadata fields such as "productname", "creationdate", etc.

我在考虑压缩和解的每个文件,并在SQL Server数据库5-10搜索(索引)的列存储它们和VARBINARY(MAX)列压缩的文件数据。

I'm considering zipping each file and storing them in a SQL Server database with 5-10 searchable (indexed) columns and a varbinary(MAX) column for the zipped file data.

该数据库将在年内增长巨大; 5-10铽。因此,我认为,我们需要通过保持每年一个数据库分区,例如数据。

The database will be grow huge over the years; 5-10 Tb. So I think we need to partition data for example by keeping one database per year.

我一直在寻找到SQL Server中保存数据的VARBINARY列使用FILESTREAM,但似乎这是更适合的斑点> 1兆?

I've been looking into using FILESTREAM in SQL Server for the varbinary column that holds the data, but it seems this is more suitable for blobs > 1 Mb?

这是如何管理这些数据卷的任何其他建议?

Any other suggestions on how to manage such data volumes?

推荐答案

文件流肯定是更适合于较大的斑点(750KB-1MB)以打开外部文件所需的开销开始影响读写性能与VB(最大)Blob存储的小文件。如果不是这么多的一个问题(即最初的写操作之后BLOB数据读取并不多见,而斑点是有效不可变的),那么它肯定是一个选项。

Filestream is definitely more suited to larger blobs (750kB-1MB) as the overhead required to open the external file begins to impact read and write performance vs. vb(max) blob storage for small files. If this is not so much of an issue (ie. reads of blob data after the initial write are infrequent, and the blobs are effectively immutable) then it's definitely an option.

我可能会建议直接在VB(max)列保持文件,如果你可以使用TEXTIMAGE_ON选项,将允许您以保证他们不会在规模大得多,但储存在一个单独的文件组此表它从元数据,如果必要的其他部分移动到不同的存储。此外,确保设计你的架构,以便斑点的实际存储空间可以拆分成多个文件组或者使用分区或通过某种多个表的方案,以便在需要时可以在未来。扩展到不同的磁盘

I would probably suggest keeping the files directly in a vb(max) column if you can guarantee they won't get much larger in size, but have this table stored in a seperate filegroup using the TEXTIMAGE_ON option which would allow you to move it to different storage from the rest of the metadata if necessary. Also, make sure to design your schema so the actual storage of blobs can be split over multiple filegroups either using partitions or via some multiple table scheme so you can scale to different disks if necessary in the future.

保持或者通过文件流或直接用vb的SQL​​元数据直接相关的斑点(最大)存储有超过处理限制,以缓解备份等管理操作的文件系统/ SQL矛盾没有多少优势。

Keeping the blobs directly associated with the SQL metadata either via Filestream or direct vb(max) storage has many advantages over dealing with filesystem / SQL inconsistencies not limited to ease of backup and other management operations.

这篇关于在庞大的SQL Server数据库BLOB数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆