Extracting files from an Attachment field in an Access database


Problem description

We are working on a project where we need to migrate data stored in an Access database to a cache database. The Access database contains columns with a data type of Attachment; some of the tuples contain multiple attachments. I am able to obtain the filenames of these files by using .FileName, but I'm unsure how to determine when one file ends and another starts in .FileData.

I am using the following to obtain this data:

System.Data.OleDb.OleDbCommand command= new System.Data.OleDb.OleDbCommand();
command.CommandText = "select [Sheet1].[pdf].FileData,* from [Sheet1]";
command.Connection = conn;
System.Data.OleDb.OleDbDataReader rdr = command.ExecuteReader();

Solution

(My original answer to this question was misleading. It worked okay for PDF files that were subsequently opened with Adobe Reader, but it did not always work properly for other types of files. The following is the corrected version.)

Unfortunately we cannot directly retrieve the contents of a file in an Access Attachment field using OleDb. The Access Database Engine prepends some metadata to the binary contents of the file, and that metadata is included if we retrieve the .FileData via OleDb.

To illustrate, a document named "Document1.pdf" is saved to an Attachment field using the Access UI. The beginning of that PDF file looks like this:

If we use the following code to try to extract the PDF file to disk

using (OleDbCommand cmd = new OleDbCommand())
{
    cmd.Connection = con;
    cmd.CommandText = 
            "SELECT Attachments.FileData " +
            "FROM AttachTest " +
            "WHERE Attachments.FileName='Document1.pdf'";
    using (OleDbDataReader rdr = cmd.ExecuteReader())
    {
        rdr.Read();
        byte[] fileData = (byte[])rdr[0];
        using (var fs = new FileStream(
                @"C:\Users\Gord\Desktop\FromFileData.pdf", 
                FileMode.Create, FileAccess.Write))
        {
            // The enclosing using block closes the stream; no explicit Close() is needed.
            fs.Write(fileData, 0, fileData.Length);
        }
    }
}

then the resulting file will include the metadata at the beginning of the file (20 bytes in this case).

Adobe Reader is able to open this file because it is robust enough to ignore any "junk" that may appear in the file before the '%PDF-1.4' signature. Unfortunately not all file formats and applications are so forgiving of extraneous bytes at the beginning of the file.
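To see how many extra bytes precede the real start of a file retrieved this way, one option is to search the byte array for the format's signature. The following is only a minimal sketch: `FindSignatureOffset` is a hypothetical helper, and the array built in `Main` is a made-up stand-in for the `byte[]` read from `rdr[0]` (20 filler bytes followed by a PDF signature), not real attachment data.

```csharp
using System;
using System.Text;

static class SignatureCheck
{
    // Returns the index of the first occurrence of signature in data, or -1 if absent.
    public static int FindSignatureOffset(byte[] data, byte[] signature)
    {
        for (int i = 0; i <= data.Length - signature.Length; i++)
        {
            int j = 0;
            while (j < signature.Length && data[i + j] == signature[j])
            {
                j++;
            }
            if (j == signature.Length)
            {
                return i;
            }
        }
        return -1;
    }

    static void Main()
    {
        // Stand-in for the bytes returned via OleDb: 20 zero bytes, then "%PDF-1.4".
        byte[] fileData = new byte[20 + 8];
        Encoding.ASCII.GetBytes("%PDF-1.4").CopyTo(fileData, 20);

        int offset = FindSignatureOffset(fileData, Encoding.ASCII.GetBytes("%PDF"));
        Console.WriteLine(offset); // prints 20
    }
}
```

This is useful for diagnosis only. It works for formats with a known leading signature and says nothing about what the prepended bytes mean.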

The only Official™ way of extracting files from an Attachment field in Access is to use the .SaveToFile method of an ACE DAO Field2 object, like so:

// required COM reference: Microsoft Office 14.0 Access Database Engine Object Library
//
// using Microsoft.Office.Interop.Access.Dao; ...
var dbe = new DBEngine();
Database db = dbe.OpenDatabase(@"C:\Users\Public\Database1.accdb");
Recordset rstMain = db.OpenRecordset(
        "SELECT Attachments FROM AttachTest WHERE ID=1",
        RecordsetTypeEnum.dbOpenSnapshot);
Recordset2 rstAttach = (Recordset2)rstMain.Fields["Attachments"].Value;
// Check EOF first so FileName is never read past the last attachment.
while ((!rstAttach.EOF) && (!"Document1.pdf".Equals(rstAttach.Fields["FileName"].Value)))
{
    rstAttach.MoveNext();
}
if (rstAttach.EOF)
{
    Console.WriteLine("Not found.");
}
else
{
    Field2 fld = (Field2)rstAttach.Fields["FileData"];
    fld.SaveToFile(@"C:\Users\Gord\Desktop\FromSaveToFile.pdf");
}
db.Close();

Note that if you try to use the .Value of the Field2 object you will still get the metadata at the beginning of the byte sequence; the .SaveToFile process is what strips it out.
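For the original question of rows that hold several attachments, the same DAO approach avoids having to find where one file ends and the next begins: each attachment appears as its own record in the child Recordset2. The sketch below reuses the table and field names from the example above and saves every attachment in every row. The output folder, the use of the ID column, and the ID-prefixed file names are assumptions made for illustration.

```csharp
// required COM reference: Microsoft Office 14.0 Access Database Engine Object Library
using System;
using System.IO;
using Microsoft.Office.Interop.Access.Dao;

static class SaveAllAttachments
{
    static void Main()
    {
        // Hypothetical output folder; adjust to suit.
        string outDir = @"C:\Users\Public\Attachments";
        Directory.CreateDirectory(outDir);

        var dbe = new DBEngine();
        Database db = dbe.OpenDatabase(@"C:\Users\Public\Database1.accdb");
        Recordset rstMain = db.OpenRecordset(
                "SELECT ID, Attachments FROM AttachTest",
                RecordsetTypeEnum.dbOpenSnapshot);
        while (!rstMain.EOF)
        {
            // Each attachment in the current row is one record in the child recordset.
            var rstAttach = (Recordset2)rstMain.Fields["Attachments"].Value;
            while (!rstAttach.EOF)
            {
                string fileName = (string)rstAttach.Fields["FileName"].Value;
                // Prefix with the row ID (an assumed naming scheme) so that
                // identically named attachments in different rows do not collide.
                string target = Path.Combine(
                        outDir, rstMain.Fields["ID"].Value + "_" + fileName);
                var fld = (Field2)rstAttach.Fields["FileData"];
                fld.SaveToFile(target);
                rstAttach.MoveNext();
            }
            rstAttach.Close();
            rstMain.MoveNext();
        }
        rstMain.Close();
        db.Close();
    }
}
```

It is probably safest to start with an empty output folder, since .SaveToFile may raise an error if the target file already exists.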
