Easiest way to process text from an MS Word file
Question
I need to extract text from an old MS Word .doc file in C#. What is the easiest (or failing that, the best) way to get that job done?
Recommended answer
First, you need to add a reference to the MS Word object library. Go to Project => Add Reference, select the COM tab, then find and select "Microsoft Word 10.0 Object Library". The version number may be different on your computer, depending on which version of Word is installed. Click OK.
Once that is done, you can use the following code. It opens an MS Word document and displays each paragraph in a message box:
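// Read an MS Word doc and show each paragraph in a message box.
// Needs: using System; using System.Windows.Forms;
//        using Word = Microsoft.Office.Interop.Word;
private void ReadWordDoc()
{
    object oNull = System.Reflection.Missing.Value;
    Word.ApplicationClass wordApp = null;
    try
    {
        // If the project embeds interop types, use "new Word.Application()" instead
        wordApp = new Word.ApplicationClass();
        // Define file path
        string fn = @"c:\test.doc";
        // Create objects for passing (the COM API takes its arguments by ref)
        object oFile = fn;
        object oReadOnly = true;
        // Open the document read-only. The argument count matches the 10.0 library;
        // other library versions may expect a different number of arguments.
        Word.Document Doc = wordApp.Documents.Open(ref oFile, ref oNull,
            ref oReadOnly, ref oNull, ref oNull, ref oNull, ref oNull,
            ref oNull, ref oNull, ref oNull, ref oNull, ref oNull,
            ref oNull, ref oNull, ref oNull);
        // Read each paragraph and show it
        foreach (Word.Paragraph oPara in Doc.Paragraphs)
            MessageBox.Show(oPara.Range.Text);
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
    }
    finally
    {
        // Quit Word even if an error occurred, so no hidden Word process is left running
        if (wordApp != null)
            wordApp.Quit(ref oNull, ref oNull, ref oNull);
    }
}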
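If the goal is to get the text into a string rather than show it in message boxes, the same approach can be shortened. The following is only a sketch under some assumptions: it presumes C# 4 or later (which allows optional and named arguments on COM calls), a reference to a Word interop assembly, and Word installed on the machine. The class name WordTextExtractor and the method name ExtractText are made up for this illustration.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class WordTextExtractor
{
    // Hypothetical helper: returns the main body text of a Word document.
    public static string ExtractText(string path)
    {
        Word.Application app = new Word.Application();
        Word.Document doc = null;
        try
        {
            // Named argument: open read-only, leave every other option at its default
            doc = app.Documents.Open(path, ReadOnly: true);
            // Content covers the main document story; Text returns it as one string
            return doc.Content.Text;
        }
        finally
        {
            // Close without saving, then shut down the Word process
            if (doc != null)
                doc.Close(SaveChanges: false);
            app.Quit();
        }
    }
}
```

A call such as string text = WordTextExtractor.ExtractText(@"c:\test.doc"); would then return the document body in one piece. Word separates paragraphs in that string with carriage return characters, so it may need normalizing before further processing.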