Easiest way to process text from an MS Word file


Question

I need to extract text from an old MS Word .doc file in C#. What is the easiest (or else the best) way to get that job done?

Recommended answer

First, add a reference to the MS Word object library. Go to Project => Add Reference, select the COM tab, then find and select "Microsoft Word 10.0 Object Library". The version number might be different on your computer. Click OK.

After you have done that, you can use the following code. It opens an MS Word doc and displays each paragraph in a message box:

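// Read an MS Word doc and show each paragraph in a message box.
// Needs "using System;" and "using System.Windows.Forms;". If the interop
// namespace on your machine is Microsoft.Office.Interop.Word rather than
// plain Word, also add "using Word = Microsoft.Office.Interop.Word;".
private void ReadWordDoc()
{
    Word.ApplicationClass wordApp = null;
    object oNull = System.Reflection.Missing.Value;

    try
    {
        wordApp = new Word.ApplicationClass();

        // Define file path
        string fn = @"c:\test.doc";

        // Create objects for passing (the COM API takes every argument by ref)
        object oFile = fn;
        object oReadOnly = true;

        // Open the document read-only; all other arguments are left as "missing"
        Word.Document Doc = wordApp.Documents.Open(ref oFile, ref oNull,
                ref oReadOnly, ref oNull, ref oNull, ref oNull, ref oNull,
                ref oNull, ref oNull, ref oNull, ref oNull, ref oNull,
                ref oNull, ref oNull, ref oNull);

        // Read each paragraph and show it
        foreach (Word.Paragraph oPara in Doc.Paragraphs)
            MessageBox.Show(oPara.Range.Text);
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
    }
    finally
    {
        // Quit Word even if an exception was thrown, so that no hidden
        // WINWORD.EXE process is left running
        if (wordApp != null)
            wordApp.Quit(ref oNull, ref oNull, ref oNull);
    }
}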
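
Since the question asks for the text itself, not a series of message boxes, the same calls can be wrapped in a method that returns the whole document as one string. The following is only a sketch under some assumptions: the class name WordTextExtractor and the method ExtractText are hypothetical, the interop namespace is assumed to be Microsoft.Office.Interop.Word, the 15-argument Open call is assumed to match the object library version referenced above, and the project is assumed to reference the interop assembly without embedding its types (so that ApplicationClass is usable).

```csharp
using System;
using System.Text;
// Assumption: the interop assembly exposes this namespace. Older COM
// references may expose it simply as "Word"; in that case drop this alias.
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the original answer.
static class WordTextExtractor
{
    public static string ExtractText(string path)
    {
        object oFile = path;
        object oNull = System.Reflection.Missing.Value;
        object oReadOnly = true;
        object oNoSave = false;   // never save changes when closing

        Word.ApplicationClass wordApp = new Word.ApplicationClass();
        try
        {
            // Same read-only Open call as in the answer's code
            Word.Document doc = wordApp.Documents.Open(ref oFile, ref oNull,
                    ref oReadOnly, ref oNull, ref oNull, ref oNull, ref oNull,
                    ref oNull, ref oNull, ref oNull, ref oNull, ref oNull,
                    ref oNull, ref oNull, ref oNull);

            StringBuilder sb = new StringBuilder();
            foreach (Word.Paragraph para in doc.Paragraphs)
            {
                // Range.Text ends with the paragraph mark ('\r'); text in
                // table cells additionally ends with '\a'. Strip both.
                sb.AppendLine(para.Range.Text.TrimEnd('\r', '\a'));
            }

            // Cast to _Document so the Close method is not confused with
            // the Close event of the same name.
            ((Word._Document)doc).Close(ref oNoSave, ref oNull, ref oNull);
            return sb.ToString();
        }
        finally
        {
            // Always shut Word down, even when opening or reading fails
            wordApp.Quit(ref oNoSave, ref oNull, ref oNull);
        }
    }
}
```

A call such as `string text = WordTextExtractor.ExtractText(@"c:\test.doc");` would then return the paragraphs separated by line breaks. Like the code above, this requires Word to be installed on the machine that runs it.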
