How to get the line number and page number of a particular word in a doc/docx file using Apache POI?

Views: 66  Posted: 2021/11/12 4:55:03  Tags: java swing apache-poi


Problem description

I am trying to create a Java application that searches for a particular word in a selected .doc or .docx file and generates a report. The report should contain the page number and line number of each occurrence of the searched word. So far I can read .doc and .docx files paragraph by paragraph, but I have not found any way to search for a particular word and get the line and page number where it appears. I have searched a lot without success. I hope someone knows how to do this.

Here is my code:

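// Needs: java.io.File, java.io.FileInputStream, java.util.List,
// org.apache.poi.xwpf.usermodel.XWPFDocument, org.apache.poi.xwpf.usermodel.XWPFParagraph,
// org.apache.poi.hwpf.HWPFDocument, org.apache.poi.hwpf.extractor.WordExtractor
File file = fc.getSelectedFile();
// Test the extension only; contains("docx") on the full path would also match
// a .doc file stored in a folder whose name contains "docx".
if (file.getName().toLowerCase().endsWith(".docx")) {
    // try-with-resources closes the stream even if reading fails
    try (FileInputStream fis = new FileInputStream(file)) {
        XWPFDocument document = new XWPFDocument(fis);
        List<XWPFParagraph> paragraphs = document.getParagraphs();
        System.out.println("Total no of paragraph " + paragraphs.size());
        for (XWPFParagraph para : paragraphs) {
            System.out.println(para.getText());
        }
    }
} else {
    // Treat everything else as the binary .doc format
    try (FileInputStream fis = new FileInputStream(file)) {
        HWPFDocument document = new HWPFDocument(fis);
        WordExtractor extractor = new WordExtractor(document);
        String[] fileData = extractor.getParagraphText();
        for (String paragraph : fileData) {
            if (paragraph != null) {
                System.out.println(paragraph);
            }
        }
        extractor.close();
    }
}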

I am using Swing and Apache POI 3.10.1.
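
Since the code above already produces the text of each paragraph, the search step itself can be sketched independently of POI. The following is only a minimal sketch under stated assumptions, not a complete answer to the question: `WordLocator`, `Hit`, and `find` are hypothetical names introduced here, and the sketch reports the paragraph number and character offset of each match rather than a page or line number. Page and line positions depend on how the document is laid out when it is rendered, and this sketch does not attempt to compute them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of POI): finds a word in already-extracted paragraph text. */
final class WordLocator {

    /** One match: 1-based paragraph number and 0-based character offset within that paragraph. */
    static final class Hit {
        final int paragraph;
        final int offset;

        Hit(int paragraph, int offset) {
            this.paragraph = paragraph;
            this.offset = offset;
        }

        @Override
        public String toString() {
            return "paragraph " + paragraph + ", offset " + offset;
        }
    }

    /** Case-insensitive, whole-word search over the given paragraph strings. */
    static List<Hit> find(List<String> paragraphs, String word) {
        List<Hit> hits = new ArrayList<>();
        if (word == null || word.isEmpty()) {
            return hits;
        }
        // Pattern.quote keeps regex metacharacters in the search word literal;
        // \b restricts matches to whole words.
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        for (int i = 0; i < paragraphs.size(); i++) {
            String text = paragraphs.get(i);
            if (text == null) {
                continue; // skip null entries, as the .doc branch above does
            }
            Matcher matcher = pattern.matcher(text);
            while (matcher.find()) {
                hits.add(new Hit(i + 1, matcher.start()));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Stand-in for text obtained from XWPFParagraph.getText() or
        // WordExtractor.getParagraphText().
        List<String> paragraphs = Arrays.asList(
                "Apache POI reads Word files.",
                "POI exposes paragraphs; poi does not lay out pages.");
        for (Hit hit : find(paragraphs, "poi")) {
            System.out.println(hit);
        }
    }
}
```

To use it with the code above, collect `para.getText()` (for .docx) or the entries of `extractor.getParagraphText()` (for .doc) into a list and pass that list to `find`. Working on plain strings keeps the search identical for both formats.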
