How to parse text from MS Word document to string

Question

I am trying to find a way to parse a Word document's text to a string in my project. I have more than 600 Word (.doc) files; for each one I need to get the text content (with the new lines and tabs, if possible) and assign it to a string.

I've been reading stuff about the Open XML SDK but it looks quite complicated for something that looks so simple.

Recommended answer

The Open XML SDK only handles the Word 2007 and newer formats (.docx), so it will not read your .doc files, and it is not trivial to use.

If performance is not an issue, you could use Word Automation and have Word do this for you (Word must be installed on the machine that runs the code). It will look something like this:

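using System.Runtime.InteropServices;
using Microsoft.Office.Interop.Word;

var app = new Application();
// Open read-only so the source file is never modified.
var doc = app.Documents.Open(documentLocation, ReadOnly: true);

// Range() with no arguments covers the main body of the document.
string rangeText = doc.Range().Text;

// Nothing was changed, so close without saving, then shut Word down
// so that no WINWORD.EXE process is left running.
doc.Close(SaveChanges: false);
app.Quit();

Marshal.ReleaseComObject(doc);
Marshal.ReleaseComObject(app);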
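
Since the question asks about keeping new lines and tabs: the string returned by Range.Text keeps tabs as "\t", but it marks paragraph ends with a bare "\r", manual line breaks with "\v", and table cell ends with "\a". A minimal sketch of a clean-up step follows; the class and method names (WordText, Normalize) are hypothetical, and dropping the cell markers rather than turning them into a separator is just one possible choice.

```csharp
using System;

public static class WordText
{
    // Hypothetical helper: converts Word's control characters into
    // the platform's normal line endings. Tabs are left untouched.
    public static string Normalize(string rangeText)
    {
        return rangeText
            .Replace("\r\n", "\r")               // collapse any existing CRLF first
            .Replace("\a", "")                   // drop table cell/row end markers
            .Replace("\v", "\r")                 // manual line break -> paragraph break
            .Replace("\r", Environment.NewLine); // paragraph mark -> newline
    }
}
```

It would be called as `string text = WordText.Normalize(doc.Range().Text);` right after reading the range.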



Take a look at http://www.codeproject.com/Articles/18703/Word-2007-Automation or http://www.codeproject.com/Articles/21247/Word-Automation for more complete examples and instructions. Note that this may become a bit more tricky if your documents are more complex (footnotes, text boxes, tables...).
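
For such documents, one approach is to walk every "story" in the document (main text, footnotes, headers, text frames and so on) instead of reading only the main range. The following is a rough sketch, assuming a reference to Microsoft.Office.Interop.Word and an already opened Document; the class and method names are hypothetical, and the order in which stories come back may not match the visual order on the page.

```csharp
using System.Text;
using Microsoft.Office.Interop.Word;

public static class WordStories
{
    // Hypothetical helper: concatenates the text of every story in the document.
    public static string ReadAllStories(Document doc)
    {
        var sb = new StringBuilder();
        foreach (Range story in doc.StoryRanges)
        {
            // Stories of the same type are chained through NextStoryRange,
            // which is null at the end of the chain.
            for (Range r = story; r != null; r = r.NextStoryRange)
            {
                sb.AppendLine(r.Text);
            }
        }
        return sb.ToString();
    }
}
```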

Another option is to have Word save the document as plain text and then read the text file. Take a look at this: http://msdn.microsoft.com/en-us/library/microsoft.office.tools.word.document.saveas(v=vs.80).aspx
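
A rough sketch of that approach, using the interop Document.SaveAs method rather than the VSTO wrapper from the link. The helper name, the temporary file location, and the choice of wdFormatUnicodeText (read back here as UTF-16) are assumptions for illustration, not something the linked page prescribes.

```csharp
using System.IO;
using System.Text;
using Microsoft.Office.Interop.Word;

public static class WordToText
{
    // Hypothetical helper: exports one document to a temporary .txt file
    // and returns its contents. The caller owns the Application instance.
    public static string ReadViaTextExport(Application app, string docPath)
    {
        string txtPath = Path.Combine(Path.GetTempPath(),
                                      Path.GetRandomFileName() + ".txt");
        Document doc = app.Documents.Open(docPath, ReadOnly: true);
        try
        {
            // Write a plain-text copy; the original .doc file is not modified.
            doc.SaveAs(FileName: txtPath,
                       FileFormat: WdSaveFormat.wdFormatUnicodeText);
        }
        finally
        {
            doc.Close(SaveChanges: false);
        }

        try
        {
            // Assumes the Unicode text export is UTF-16.
            return File.ReadAllText(txtPath, Encoding.Unicode);
        }
        finally
        {
            File.Delete(txtPath);
        }
    }
}
```

Passing in a single Application instance lets the same Word process be reused across all 600 files instead of starting Word once per document.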
