Extract OLE Object (Word document) from MS Access


Problem description

I have a Microsoft Access database with an OLE Object field holding a Microsoft Word document. I have tried to find code to retrieve the file saved in the OLE Object, so that the user can download it from a button in my JavaFx application, but I had no success.

I have the following but I don't know what to do after this. Also, inputStream is always null.

InputStream inputStream = res.getBinaryStream(6);  

Recommended answer

You seem to be on the right track with regard to getting the binary data out of the database. The following code works for me with UCanAccess 3.0.0 under Java 7, where [Doc] is an OLE Object field in an Access table:

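// Needs java.io.File, java.io.InputStream, java.nio.file.Files,
// java.sql.ResultSet and java.sql.Statement; conn is an open UCanAccess connection.
String sql = "SELECT Doc FROM OleTest WHERE ID=1";
try (Statement st = conn.createStatement();
        ResultSet rs = st.executeQuery(sql)) {
    if (rs.next()) {  // only read the column if a row was returned
        // Stream the contents of the OLE Object field and copy them to a file
        InputStream inputStream = rs.getBinaryStream(1);
        File f = new File("C:/Users/Gord/Desktop/thing.bin");
        Files.copy(
                inputStream,
                f.toPath(),
                java.nio.file.StandardCopyOption.REPLACE_EXISTING);
    }
}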

Now the question is whether the field contained a Word document

  1. in raw binary format, or
  2. as a true OLE ("wrapped") object.

If the field contained a document in raw binary format then we could just rename the file to .docx and open it directly in Word.
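
One way to tell the two cases apart is to look at the first bytes of the extracted file. The following is only a minimal sketch: the class and method names (DocSignature, classify) are made up for illustration, and it relies on the fact that a .docx file is a ZIP container starting with the bytes 50 4B 03 04, while a legacy .doc file is an OLE2 compound file starting with D0 CF 11 E0 A1 B1 1A E1. Anything else is reported as unknown, which would be the case for OLE-wrapped data.

```java
import java.util.Arrays;

final class DocSignature {
    // First four bytes of a ZIP local file header ("PK\3\4"); a .docx is a ZIP container.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // First eight bytes of an OLE2 compound file, as used by legacy .doc files.
    private static final byte[] CFB_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};

    // Returns true if data begins with the given prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Hypothetical helper: a rough guess at what the extracted bytes are.
    static String classify(byte[] data) {
        if (startsWith(data, ZIP_MAGIC)) {
            return "raw .docx (ZIP)";
        }
        if (startsWith(data, CFB_MAGIC)) {
            return "OLE2 compound file";
        }
        return "unknown (possibly OLE-wrapped)";
    }
}
```

For the file written by the code above, the bytes could be loaded with Files.readAllBytes(f.toPath()) and passed to DocSignature.classify.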

However, in my case it was stored as a "wrapped" OLE Object because I had embedded the document into the table using "Insert object..." in Access itself. Therefore the .docx (Word) document, which in raw form looks like this ...

... is extracted from the database with its "OLE wrapper" around it:

If we search down through the OLE data from the database we can see the beginning of the raw binary data, in this case at offset 0xA57:
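
The same search can be done in code rather than in a hex viewer. This is a rough sketch, assuming the embedded document is a .docx, so that its first bytes are the ZIP signature 50 4B 03 04; the class name SignatureSearch and the method indexOf are hypothetical names, not part of UCanAccess or the JDK.

```java
final class SignatureSearch {
    // ZIP local file header signature ("PK\3\4") that opens a .docx file.
    static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Returns the offset of the first occurrence of pattern in data, or -1 if absent.
    static int indexOf(byte[] data, byte[] pattern) {
        outer:
        for (int i = 0; i + pattern.length <= data.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;  // mismatch, try the next start position
                }
            }
            return i;
        }
        return -1;
    }
}
```

Calling SignatureSearch.indexOf(oleBytes, SignatureSearch.ZIP_MAGIC) on the bytes read from the OLE Object field would give the offset at which the candidate document starts, or -1 if no such signature is present.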

So, unfortunately we cannot simply save the OLE binary data into a file and then open that file directly in Word because it is not a valid Word file.

Removing OLE "wrappers" can be tricky. Some file formats are designed to ignore extraneous bytes at the end of a file, so approaches like the one described in this answer (which just removes the "front part" of the OLE wrapper) can be used with image file formats like BMP, JPEG, etc. Unfortunately, Word documents are much less forgiving of "junk" at the end of the file, so just removing the "front part" of the OLE wrapper can still result in a file that Word cannot open.
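
For a .docx specifically, one possible way to drop the trailing bytes as well is to use the ZIP structure itself: a ZIP archive ends with an "end of central directory" record that starts with the signature 50 4B 05 06, is 22 bytes long, and is followed by an optional comment whose length is stored as a little-endian 16-bit value at offset 20 of that record. The sketch below cuts the data from the first local file header to the end of the last such record. It is an illustration, not a tested recipe: ZipCarver and carveZip are hypothetical names, and it assumes that the .docx bytes sit in one contiguous run inside the OLE data and that the signatures do not also occur by chance in the wrapper, neither of which is guaranteed.

```java
import java.util.Arrays;

final class ZipCarver {
    private static final byte[] LOCAL_HEADER = {0x50, 0x4B, 0x03, 0x04};
    private static final byte[] EOCD = {0x50, 0x4B, 0x05, 0x06};
    private static final int EOCD_FIXED_SIZE = 22;  // record size without the comment

    // True if pattern occurs in data at position pos.
    private static boolean matchesAt(byte[] data, byte[] pattern, int pos) {
        for (int j = 0; j < pattern.length; j++) {
            if (data[pos + j] != pattern[j]) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: returns the carved ZIP bytes, or null if no plausible ZIP is found.
    static byte[] carveZip(byte[] ole) {
        int start = -1;
        for (int i = 0; i + LOCAL_HEADER.length <= ole.length; i++) {
            if (matchesAt(ole, LOCAL_HEADER, i)) {
                start = i;  // first local file header
                break;
            }
        }
        int eocd = -1;
        for (int i = ole.length - EOCD.length; i >= 0; i--) {
            if (matchesAt(ole, EOCD, i)) {
                eocd = i;  // last end-of-central-directory record
                break;
            }
        }
        if (start < 0 || eocd < start || eocd + EOCD_FIXED_SIZE > ole.length) {
            return null;
        }
        // Comment length: little-endian unsigned 16-bit value at offset 20 of the record.
        int commentLen = (ole[eocd + 20] & 0xFF) | ((ole[eocd + 21] & 0xFF) << 8);
        int end = eocd + EOCD_FIXED_SIZE + commentLen;
        if (end > ole.length) {
            return null;
        }
        return Arrays.copyOfRange(ole, start, end);
    }
}
```

If carveZip returns a non-null array, writing it to a file with a .docx extension and opening it in Word is the real test. A null result, or a file that Word rejects, suggests the assumptions above do not hold for that particular wrapper.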
