How to extract the contents of an OLE container?

Question

I need to break open an MS Word file (.doc) and extract its constituent files ('[1]CompObj', 'WordDocument', etc.). Something like 7-zip can be used to do this manually, but I need to do it programmatically.

I've gathered that a Word document is an OLE container (hence why 7-zip can be used to view its contents) but I can't work out how to (using C++):


  1. Open the OLE container

  2. Extract each constituent file and save it to disk

I've found a couple of examples of OLE automation (e.g. here), but what I want to do seems to be less common and I've found no specific examples.

If anyone knows of an API (?!) or a tutorial for working with OLE, I'd be grateful. Ditto any code samples.

Recommended answer

The format is called Compound Files, part of the Structured Storage API. You start with StgOpenStorageEx(). It buys you little for a Word .doc file, though: the streams themselves have a sophisticated binary format. To really read the document content you want to use automation, letting Word read the file. That's rarely done in C++ but that project shows you how.
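
If the goal is only to dump the raw streams to disk (the two steps in the question), the Structured Storage route can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes Windows and linking against ole32 and uuid, the helper names `SafeFileName`, `SaveStream`, `ExtractStorage`, and `ExtractCompoundFile` are hypothetical, and error handling is kept to a minimum. The `[1]` rendering of a leading control character mirrors how the question shows the `[1]CompObj` name.

```cpp
#include <string>

// Hypothetical helper: element names may start with a control character
// (U+0001 in front of "CompObj", for instance). Render it as "[1]" and
// replace characters that are not allowed in Windows file names.
std::wstring SafeFileName(const std::wstring& name) {
    const std::wstring forbidden = L"\\/:*?\"<>|";
    std::wstring out;
    for (wchar_t c : name) {
        if (c < 0x20) {
            out += L"[" + std::to_wstring(static_cast<int>(c)) + L"]";
        } else if (forbidden.find(c) != std::wstring::npos) {
            out += L'_';
        } else {
            out += c;
        }
    }
    return out;
}

#ifdef _WIN32
#include <windows.h>
#include <objbase.h>

// Copies one named stream of a storage into a file on disk.
static HRESULT SaveStream(IStorage* stg, const wchar_t* name,
                          const std::wstring& path) {
    IStream* stm = nullptr;
    HRESULT hr = stg->OpenStream(name, nullptr,
                                 STGM_READ | STGM_SHARE_EXCLUSIVE, 0, &stm);
    if (FAILED(hr)) return hr;
    HANDLE h = CreateFileW(path.c_str(), GENERIC_WRITE, 0, nullptr,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE) {
        stm->Release();
        return HRESULT_FROM_WIN32(GetLastError());
    }
    char buf[4096];
    ULONG got = 0;
    // Read succeeds with S_OK or S_FALSE; zero bytes read means end of stream.
    while (SUCCEEDED(hr = stm->Read(buf, sizeof buf, &got)) && got > 0) {
        DWORD written = 0;
        WriteFile(h, buf, got, &written, nullptr);
    }
    CloseHandle(h);
    stm->Release();
    return FAILED(hr) ? hr : S_OK;
}

// Walks a storage: streams become files, sub-storages become directories.
static HRESULT ExtractStorage(IStorage* stg, const std::wstring& outDir) {
    CreateDirectoryW(outDir.c_str(), nullptr);
    IEnumSTATSTG* it = nullptr;
    HRESULT hr = stg->EnumElements(0, nullptr, 0, &it);
    if (FAILED(hr)) return hr;
    STATSTG st;
    while (it->Next(1, &st, nullptr) == S_OK) {
        std::wstring target = outDir + L"\\" + SafeFileName(st.pwcsName);
        if (st.type == STGTY_STREAM) {
            SaveStream(stg, st.pwcsName, target);
        } else if (st.type == STGTY_STORAGE) {
            IStorage* child = nullptr;
            if (SUCCEEDED(stg->OpenStorage(st.pwcsName, nullptr,
                                           STGM_READ | STGM_SHARE_EXCLUSIVE,
                                           nullptr, 0, &child))) {
                ExtractStorage(child, target);
                child->Release();
            }
        }
        CoTaskMemFree(st.pwcsName);  // the enumerator allocates the name
    }
    it->Release();
    return S_OK;
}

// Opens the compound file read-only and extracts everything under outDir.
HRESULT ExtractCompoundFile(const wchar_t* docPath,
                            const std::wstring& outDir) {
    IStorage* root = nullptr;
    HRESULT hr = StgOpenStorageEx(docPath, STGM_READ | STGM_SHARE_DENY_WRITE,
                                  STGFMT_STORAGE, 0, nullptr, nullptr,
                                  IID_IStorage,
                                  reinterpret_cast<void**>(&root));
    if (FAILED(hr)) return hr;
    hr = ExtractStorage(root, outDir);
    root->Release();
    return hr;
}
#endif  // _WIN32
```

A call such as `ExtractCompoundFile(L"report.doc", L"out")` (both paths are placeholders) would then write one file per stream. As the answer notes, the extracted 'WordDocument' stream is still in Word's binary format, so this gives you the container's parts rather than readable text.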
