Storing PDF files as binary objects in SQL Server, yes or no?


Question

I have to find a design decision for the following task:

I have a SQL Server database that contains a table of orders. PDF documents will be uploaded by users through a simple file upload from a web page and assigned to an order. There is never more than one document per order (possibly none). To upload, a user opens a web page, enters an order number, gets the order displayed and clicks an upload button. So I know which order the uploaded document belongs to.

Now I am considering two options to store the documents on the web server:

1) Extend my table of orders by a varbinary(MAX) column and store the PDF document directly into that binary field.

2) Save the PDF file in a specific folder on disk and give it a unique name related to the order (for instance my order number, which is a primary key in the database, or a GUID which I could store in an additional column of the order table). Perhaps I have to store the files in subfolders, one per month, and store the subfolder name in the order row in the database, to avoid ending up with many thousands of files in one folder.
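To make option (2) concrete, here is a minimal sketch of the naming scheme described above: one subfolder per month and a GUID as the file name. The root folder `D:/OrderDocs`, the function name `document_path`, the sample date and the sample GUID are all assumptions made up for illustration, not part of the actual application.

```python
from datetime import date
from pathlib import PureWindowsPath
import uuid


def document_path(root: str, uploaded_on: date, doc_id: uuid.UUID) -> PureWindowsPath:
    """Build <root>\\<YYYY-MM>\\<guid>.pdf for an uploaded document."""
    subfolder = uploaded_on.strftime("%Y-%m")  # one subfolder per month
    return PureWindowsPath(root) / subfolder / f"{doc_id}.pdf"


# Hypothetical values: the GUID would be stored in the order row.
doc_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
path = document_path("D:/OrderDocs", date(2011, 3, 15), doc_id)
print(path)
```

With about 1500 uploads per month, each subfolder would stay at roughly 1500 files.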

After the PDF files are stored they can be downloaded and viewed via browser after entering the related order number.

I'm leaning towards option (1) because data management seems easier to me with all relevant data in one database. But I am a bit afraid that I could run into performance issues over time, since my database will grow much faster than with option (2). Around 90% or even 95% of the total database size would consist of those stored PDF files alone.

Here is some additional information:

  • The PDF files will be around 100 KB each
  • Around 1500 orders/PDF files per month
  • Windows Server 2008 R2 / IIS 7.5
  • SQL Server 2008 SP1 Express
  • Not quite sure about the hardware; I believe one quad-core processor and 4 GB RAM
  • The application is written in ASP.NET Webforms 3.5 SP1

(I am aware that I will reach the 4 GB limit of the SQL Server Express edition after around 2 years with the numbers above. But we can disregard this here; either removing old data from the database or upgrading to a full license will be a possible option.)
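As a rough check of that estimate, the arithmetic can be sketched as follows. It uses only the figures stated above (100 KB per PDF, 1500 PDFs per month, a 4 GB limit) and assumes, for simplicity, that the PDFs make up 90% of the database; the variable names are illustrative.

```python
PDF_SIZE_KB = 100        # average PDF size from the list above
PDFS_PER_MONTH = 1500    # uploads per month from the list above
EXPRESS_LIMIT_GB = 4     # database size limit mentioned above
PDF_SHARE = 0.90         # assumption: PDFs are about 90% of the database

# Growth caused by the PDFs alone, in MB per month.
monthly_growth_mb = PDF_SIZE_KB * PDFS_PER_MONTH / 1024

# Months until the PDFs alone fill the limit.
months_pdf_only = EXPRESS_LIMIT_GB * 1024 / monthly_growth_mb

# Months until the limit if the remaining 10% is other data.
months_with_other_data = months_pdf_only * PDF_SHARE

print(f"growth: {monthly_growth_mb:.0f} MB/month")
print(f"limit reached after {months_pdf_only:.0f} months (PDFs only)")
print(f"limit reached after {months_with_other_data:.0f} months (with other data)")
```

Under these assumptions the limit is reached a little after the two-year mark, which matches the estimate above.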

My question is: What are the pros and cons of the two options, and what would you recommend? Perhaps someone has had a similar task and can report on their experience.

Thank you in advance for your replies!

Related:

Storing Images in DB - Yea or Nay?

Solution

With SQL Server 2008, when you have documents that are mostly 1 MB or more in size, the FILESTREAM feature would be recommended. This is based on a paper published by Microsoft Research called "To BLOB or not to BLOB", which analyzes the pros and cons of storing blobs in a database at great length. Great read!

For documents of less than 256K on average, storing them in a VARBINARY(MAX) column seems to be the best fit.

Anything in between is a bit of a toss-up, really.
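The rule of thumb above can be written down as a small helper. This is only a sketch of the thresholds quoted in this answer (256 KB and 1 MB); the function name `suggest_storage` and its return strings are made up for illustration and are not a SQL Server API.

```python
KB = 1024
MB = 1024 * KB


def suggest_storage(avg_size_bytes: int) -> str:
    """Map an average document size to the storage option suggested above."""
    if avg_size_bytes < 256 * KB:
        return "VARBINARY(MAX)"   # small documents: keep them in the table
    if avg_size_bytes >= 1 * MB:
        return "FILESTREAM"       # large documents: use FILESTREAM
    return "toss-up"              # in between: either can work


print(suggest_storage(100 * KB))
```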

You say you'll have PDF documents mostly around 100K or so; those will store very nicely in a SQL Server table, no problem. One thing you might want to consider is having a separate table for the documents that is linked to the main facts table. That way, the facts table will be faster to use, and the documents don't get in the way of your other data.
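A minimal sketch of that separate-table layout follows. To keep it runnable anywhere, it uses Python's built-in `sqlite3` module as a stand-in for SQL Server; in SQL Server the `pdf` column would be `VARBINARY(MAX)` instead of `BLOB`. The table and column names (`orders`, `order_documents`, `customer`) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL            -- hypothetical order data
);
-- order_id is both primary key and foreign key,
-- so each order can have at most one document.
CREATE TABLE order_documents (
    order_id INTEGER PRIMARY KEY REFERENCES orders(order_id),
    pdf      BLOB NOT NULL            -- VARBINARY(MAX) in SQL Server
);
""")

pdf_bytes = b"%PDF-1.4 placeholder content"
conn.execute("INSERT INTO orders VALUES (?, ?)", (1001, "ACME"))
conn.execute("INSERT INTO order_documents VALUES (?, ?)", (1001, pdf_bytes))

# Everyday order queries read only the narrow orders table.
orders = conn.execute("SELECT order_id, customer FROM orders").fetchall()

# The document is fetched only when someone asks for it.
stored = conn.execute(
    "SELECT pdf FROM order_documents WHERE order_id = ?", (1001,)
).fetchone()[0]
```

Making `order_id` the primary key of the documents table enforces the "never more than one document per order" rule from the question, while orders without a document simply have no row there.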
