Counting pages in a Word document

Question

I'm trying to count the pages of a Word document with Java.

This is my current code. I'm using the Apache POI libraries:

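import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

String path1 = "E:/iugkh";
File f = new File(path1);
File[] files = f.listFiles();
int pagesCount = 0;
for (int i = 0; i < files.length; i++) {
    // Close the input stream once the document has been read
    try (FileInputStream in = new FileInputStream(files[i])) {
        HWPFDocument wdDoc = new HWPFDocument(new POIFSFileSystem(in));
        // Reads the page count stored in the summary metadata; POI does not lay out the pages
        int pagesNo = wdDoc.getSummaryInformation().getPageCount();
        pagesCount += pagesNo;
        System.out.println(files[i].getName() + ":\t" + pagesNo);
    }
}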

The output is:

ten.doc:    1
twelve.doc: 1
nine.doc:   1
one.doc:    1
eight.doc:  1
4teen.doc:  1
5teen.doc:  1
six.doc:    1
seven.doc:  1

This is not what I expected: the first three documents are 4 pages long each, and the others are between 1 and 5 pages long.

What am I missing?

Do I have to use another library to count the pages correctly?

Thanks in advance.

Recommended answer

This may help you. It counts the number of form feeds (sometimes used to separate pages), but I'm not sure it will work for all documents (I guess it does not).

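import org.apache.poi.hwpf.extractor.WordExtractor;

// document is an HWPFDocument that has already been opened
WordExtractor extractor = new WordExtractor(document);
String[] paragraphs = extractor.getParagraphText();

// Start at 1: a document without form feeds still has one page
int pageCount = 1;
for (int i = 0; i < paragraphs.length; ++i) {
    // Count each paragraph that contains a form feed (explicit page break)
    if (paragraphs[i].indexOf("\f") >= 0) {
        ++pageCount;
    }
}

System.out.println(pageCount);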
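
As a minimal, self-contained sketch of the same idea (not part of the original answer), the hypothetical helper FormFeedPageCounter.countPages below takes paragraph strings such as those returned by getParagraphText() and counts every form feed character, so two page breaks inside one paragraph are both counted. The class and method names are assumptions made for illustration. The result is still only an estimate, because pages that break through normal text flow contain no form feed.

```java
// Hypothetical helper, not part of Apache POI: estimates pages from explicit page breaks.
final class FormFeedPageCounter {

    // Returns 1 plus the number of form feed characters found in the paragraphs.
    static int countPages(String[] paragraphs) {
        int pageCount = 1; // a document without form feeds still has one page
        for (String paragraph : paragraphs) {
            for (int i = 0; i < paragraph.length(); i++) {
                if (paragraph.charAt(i) == '\f') {
                    pageCount++;
                }
            }
        }
        return pageCount;
    }

    public static void main(String[] args) {
        // Sample data standing in for WordExtractor output (assumed, for illustration only)
        String[] sample = {"Page one\f", "Page two\fPage three", "The end"};
        System.out.println(countPages(sample)); // prints 3
    }
}
```

With a real document, the array returned by extractor.getParagraphText() could be passed to countPages in place of the sample array.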
