How to get text from a line number in MS Word

Question

Is it possible to get the text (a line or a sentence) at a given line number in MS Word using Office automation? It is fine if I can get either the text on the given line or the sentence(s) that the line is part of.

I am not providing much code because I have no idea how a Word document is read using Office automation. I can open the file like this:

var wordApp = new ApplicationClass();
wordApp.Visible = false;
object file = path;
object misValue= Type.Missing; 
Word.Document doc = wordApp.Documents.Open(ref file, ref misValue, ref misValue,
                                           ref misValue, ref misValue, ref misValue,
                                           ref misValue, ref misValue, ref misValue,
                                           ref misValue, ref misValue, ref misValue);

// ...and what comes next, given that I have a line number such as 3?

To clarify @Richard Marskell - Drackir's doubt: although the text in MS Word is one long string, Office automation does still expose line numbers. In fact, I get the line number itself from another piece of code, like this:

Word.Revision rev = //SomeRevision
object lineNo = rev.Range.get_Information(Word.WdInformation.wdFirstCharacterLineNumber);

For instance, say the Word file looks like this:

fix grammatical or spelling errors

clarify meaning without changing it correct minor mistakes add related resources or links
always respect the original author

There are 4 lines here.

Recommended answer

Fortunately, after some epic searching, I found a solution.

    object file = Path.GetDirectoryName(Application.ExecutablePath) + @"\Answer.doc";

    Word.Application wordObject = new Word.ApplicationClass();
    wordObject.Visible = false;

    object nullobject = Missing.Value;
    Word.Document docs = wordObject.Documents.Open
        (ref file, ref nullobject, ref nullobject, ref nullobject,
        ref nullobject, ref nullobject, ref nullobject, ref nullobject,
        ref nullobject, ref nullobject, ref nullobject, ref nullobject,
        ref nullobject, ref nullobject, ref nullobject, ref nullobject);

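    String strLine;
    bool bolEOF = false;

    // Start with the selection on the first character of the document.
    docs.Characters[1].Select();

    int index = 0;
    do
    {
        // Extend the end of the selection by one line.
        object unit = Word.WdUnits.wdLine;
        object count = 1;
        wordObject.Selection.MoveEnd(ref unit, ref count);

        strLine = wordObject.Selection.Text;
        richTextBox1.Text += ++index + " - " + strLine + "\r\n"; //for our understanding

        // Collapse the selection to its end so the next pass starts on the following line.
        object direction = Word.WdCollapseDirection.wdCollapseEnd;
        wordObject.Selection.Collapse(ref direction);

        // \EndOfDoc is a predefined bookmark marking the end of the document.
        if (wordObject.Selection.Bookmarks.Exists(@"\EndOfDoc"))
            bolEOF = true;
    } while (!bolEOF);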

    docs.Close(ref nullobject, ref nullobject, ref nullobject);
    wordObject.Quit(ref nullobject, ref nullobject, ref nullobject);
    docs = null;
    wordObject = null;

Here's the genius behind the code. Follow the link for some more explanation on how it works.
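
The loop above walks every line of the document. If only one known line is needed, a different technique may be shorter: jump straight to the line with Selection.GoTo and read it through Word's predefined \Line bookmark. This is a sketch of that alternative, not the method from the answer. The class and method names (WordLineReader, GetLineText) are hypothetical, and it assumes the document is already open and active in the given Word instance. It also assumes the line number counts from the start of the document; whether the value returned by wdFirstCharacterLineNumber uses the same counting is worth checking on a multi-page file.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

public static class WordLineReader
{
    // Hypothetical helper: returns the text of one line of the active document.
    // Assumes lineNumber is 1-based and counted from the start of the document.
    public static string GetLineText(Word.Application app, int lineNumber)
    {
        object what = Word.WdGoToItem.wdGoToLine;
        object which = Word.WdGoToDirection.wdGoToAbsolute;
        object count = lineNumber;
        object name = Type.Missing;

        // Move the insertion point to the start of the requested line.
        app.Selection.GoTo(ref what, ref which, ref count, ref name);

        // "\Line" is a predefined bookmark covering the line that holds the selection.
        object lineBookmark = @"\Line";
        return app.Selection.Bookmarks.get_Item(ref lineBookmark).Range.Text;
    }
}
```

With the document from the question open, a call such as `string third = WordLineReader.GetLineText(wordApp, 3);` would be the intended usage. Line breaks depend on pagination and the current layout, so the same number can point at different text after the document is edited or the page setup changes.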
