VSTO 2007: how do I determine the page and paragraph number of a Range?

Problem description

I'm building an MS Word add-in that has to gather all comment balloons from a document and summarize them in a list. My result will be a list of ReviewItem objects, each containing the comment itself, the paragraph number, and the page number on which the commented text resides.

Part of my code looks like this:

    private static List<ReviewItem> FindComments()
    {
        List<ReviewItem> result = new List<ReviewItem>();
        foreach (Comment c in WorkingDoc.Comments)
        {
            ReviewItem item = new ReviewItem()
            {
                Remark = c.Reference.Text,
                Paragraph = c.Scope. ???, // How to determine the paragraph number?
                Page = c.Scope. ??? // How to determine the page number?
            };
            result.Add(item);
        }
        return result;
    }

The Scope property of the Comment class points to the actual text in the document that the comment is about, and is of type Microsoft.Office.Interop.Word.Range. I can't work out how to determine on which page and in which paragraph that range is located.

By paragraph number, I actually mean the "numbered list" number of the paragraph, such as "2.3" or "1.3.2".

Any suggestions? Thanks!

Recommended answer

With the help Mike Regan gave me in his answer (thanks again, Mike), I managed to work out a solution that I want to share here. Maybe this also clarifies what my goal was. In terms of performance, this might not be the fastest or most efficient solution. Feel free to suggest improvements.

The result of my code is a list of ReviewItem objects that will be processed elsewhere. Without further ado, here's the code:

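using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Windows.Forms;
using Word = Microsoft.Office.Interop.Word;

/// <summary>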
/// Worker class that collects comments from a Word document and exports them as ReviewItems
/// </summary>
internal class ReviewItemCollector
{
    /// <summary>
    /// Working document
    /// </summary>
    private Word.Document WorkingDoc; // assigned when the document is opened in GetReviewResults

    /// <summary>
    /// Extracts the review results from a Word document
    /// </summary>
    /// <param name="fileName">Fully qualified path of the file to be evaluated</param>
    /// <returns></returns>
    public ReviewResult GetReviewResults(string fileName)
    {
        Word.Application wordApp = null;
        List<ReviewItem> reviewItems = new List<ReviewItem>();

        object missing = System.Reflection.Missing.Value;

        try
        {
            // Fire up Word
            wordApp = new Word.ApplicationClass();

            // Some object variables because the Word API requires this
            object fileNameForWord = fileName;
            object readOnly = true;

            WorkingDoc = wordApp.Documents.Open(ref fileNameForWord,
                ref missing, ref readOnly,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing);

            // Gather all paragraphs that are chapter headers, sorted by their start position
            var headers = (from Word.Paragraph p in WorkingDoc.Paragraphs
                           where IsHeading(p)
                           select new Heading()
                           {
                               Text = GetHeading(p),
                               Start = p.Range.Start
                           }).ToList().OrderBy(h => h.Start);

            reviewItems.AddRange(FindComments(headers));

            // I will be doing similar things with Revisions in the document
        }
        catch (Exception x)
        {
            MessageBox.Show(x.ToString(), 
                "Error while collecting review items", 
                MessageBoxButtons.OK, 
                MessageBoxIcon.Error);
        }
        finally
        {
            if (wordApp != null)
            {
                object doNotSave = Word.WdSaveOptions.wdDoNotSaveChanges;
                wordApp.Quit(ref doNotSave, ref missing, ref missing);
            }
        }
        ReviewResult result = new ReviewResult();
        result.Items = reviewItems.OrderBy(i => i.Position);
        return result;
    }

    /// <summary>
    /// Finds all comments in the document and converts them to review items
    /// </summary>
    /// <returns>List of ReviewItems generated from comments</returns>
    private List<ReviewItem> FindComments(IOrderedEnumerable<Heading> headers)
    {
        // Generate ReviewItems from the comments in the documents
        var reviewItems = from Word.Comment c in WorkingDoc.Comments
                          select new ReviewItem()
                          {
                              Position = c.Scope.Start,
                              Page = GetPageNumberOfRange(c.Scope),
                              Paragraph = GetHeaderForRange(headers, c.Scope),
                              Description = c.Range.Text,
                              ItemType = DetermineCommentType(c)
                          };

        return reviewItems.ToList();
    }

    /// <summary>
    /// Brute force translation of comment type based on the contents...
    /// </summary>
    /// <param name="c"></param>
    /// <returns></returns>
    private static string DetermineCommentType(Word.Comment c)
    {
        // This code is very specific to my solution, might be made more flexible/configurable
        // For now, this works :-)

        string text = c.Range.Text.ToLower();

        if (text.EndsWith("?"))
        {
            return "Vraag";
        }
        if (text.Contains("spelling") || text.Contains("spelfout"))
        {
            return "Spelling";
        }
        if (text.Contains("typfout") || text.Contains("typefout"))
        {
            return "Typefout";
        }
        if (text.Contains("omissie")) // text is already lower-cased above
        {
            return "Omissie";
        }

        return "Opmerking";
    }

    /// <summary>
    /// Determine the last header before the given range's start position. That would be the chapter the range is part of.
    /// </summary>
    /// <param name="headings">List of headings as identified in the document.</param>
    /// <param name="range">The current range</param>
    /// <returns></returns>
    private static string GetHeaderForRange(IEnumerable<Heading> headings, Word.Range range)
    {
        var found = (from h in headings
                     where h.Start <= range.Start
                     select h).LastOrDefault();

        if (found != null)
        {
            return found.Text;
        }
        return "Unknown";
    }

    /// <summary>
    /// Identifies whether a paragraph is a heading, based on its styling.
    /// Note: the documents we're reviewing are always in a certain format, we can assume that headers
    /// have a style named "Heading..." or "Kop..."
    /// </summary>
    /// <param name="paragraph">The paragraph to be evaluated.</param>
    /// <returns></returns>
    private static bool IsHeading(Word.Paragraph paragraph)
    {
        Word.Style style = paragraph.get_Style() as Word.Style;
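        // Parentheses matter here: without them, a null style would still reach the second StartsWith call
        return style != null && (style.NameLocal.StartsWith("Heading") || style.NameLocal.StartsWith("Kop"));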
    }

    /// <summary>
    /// Translates a paragraph into the form we want to see: preferably the chapter/paragraph number, otherwise the
    /// title itself will do.
    /// </summary>
    /// <param name="paragraph">The paragraph to be translated</param>
    /// <returns></returns>
    private static string GetHeading(Word.Paragraph paragraph)
    {
        string heading = "";

        // Try to get the list number, otherwise just take the entire heading text
        heading = paragraph.Range.ListFormat.ListString;
        if (string.IsNullOrEmpty(heading))
        {
            heading = paragraph.Range.Text;
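            // Verbatim string so that \s reaches the regex engine; strips the trailing paragraph mark and whitespace
            heading = Regex.Replace(heading, @"\s+$", "");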
        }
        return heading;
    }

    /// <summary>
    /// Determines the pagenumber of a range.
    /// </summary>
    /// <param name="range">The range to be located.</param>
    /// <returns></returns>
    private static int GetPageNumberOfRange(Word.Range range)
    {
        return (int)range.get_Information(Word.WdInformation.wdActiveEndPageNumber);
    }
}
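
The heading lookup is the part of this approach that can be tried without Word installed. The following is only a minimal sketch of that one step, not the full solution: the Heading class here is a hypothetical stand-in whose two properties are inferred from how the answer uses it, GetHeaderForPosition is an illustrative name, and plain integers stand in for Range.Start values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the answer's Heading type: the text to report
// and the character offset at which the heading paragraph starts.
public class Heading
{
    public string Text { get; set; }
    public int Start { get; set; }
}

public static class HeadingLookup
{
    // Returns the text of the last heading that starts at or before the given
    // position, or "Unknown" when no heading precedes it.
    // Assumes the headings are already sorted by Start, ascending.
    public static string GetHeaderForPosition(IEnumerable<Heading> headings, int position)
    {
        Heading found = headings.LastOrDefault(h => h.Start <= position);
        return found != null ? found.Text : "Unknown";
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up offsets, purely for illustration
        List<Heading> headings = new List<Heading>
        {
            new Heading { Text = "1", Start = 10 },
            new Heading { Text = "1.1", Start = 50 },
            new Heading { Text = "2", Start = 200 }
        }.OrderBy(h => h.Start).ToList();

        Console.WriteLine(HeadingLookup.GetHeaderForPosition(headings, 5));   // Unknown (before any heading)
        Console.WriteLine(HeadingLookup.GetHeaderForPosition(headings, 120)); // 1.1
        Console.WriteLine(HeadingLookup.GetHeaderForPosition(headings, 200)); // 2 (exactly at the heading)
    }
}
```

Because the lookup relies on the headings being sorted by start position, the sort has to happen before the list is handed over, which is what the OrderBy call in GetReviewResults does.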
