How to split a .doc into several .doc files using Java POI?


Question

I am using POI to read .doc files, and I want to select some of the contents to form new .doc files. Specifically, is it possible to write the content of a "paragraph" in the "range" to a new file? Thank you.

HWPFDocument doc = new HWPFDocument(fs);
Range range = doc.getRange();
for (int i = 0; i < range.numParagraphs(); i++) {
    // Here I want to write the content of each Paragraph
    // into a new .doc file ("doc1", "doc2", ...)
    // instead of calling doc.write(pathName), which only writes a single .doc file.
}

Recommended answer

Here is code that works for the current task. The criterion for selecting paragraphs is deliberately simple: paragraphs with indices 11..20 (zero-based, as returned by range.getParagraph) go to the file "us.docx", and those with indices 21..30 go to "japan.docx".

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;


public class SplitDocs {

    public static void main(String[] args) {

        XWPFDocument us = new XWPFDocument();
        XWPFDocument japan = new XWPFDocument();

        // try-with-resources closes the input stream even if an exception is thrown
        try (FileInputStream in = new FileInputStream("wto.doc")) {
            HWPFDocument doc = new HWPFDocument(in);
            Range range = doc.getRange();

            for (int parIndex = 0; parIndex < range.numParagraphs(); parIndex++) {
                Paragraph paragraph = range.getParagraph(parIndex);

                String text = paragraph.text();
                System.out.println("***Paragraph" + parIndex + ": " + text);

                // Route each paragraph to an output document by its zero-based index
                if ( (parIndex >= 11) && (parIndex <= 20) ) {
                    createParagraphInAnotherDocument(us, text);
                } else if ( (parIndex >= 21) && (parIndex <= 30) ) {
                    createParagraphInAnotherDocument(japan, text);
                }
            }

            // Open the output files only after the input was read successfully
            try (FileOutputStream outUs = new FileOutputStream("us.docx");
                 FileOutputStream outJapan = new FileOutputStream("japan.docx")) {
                us.write(outUs);
                japan.write(outJapan);
            }

        } catch (IOException e) {
            e.printStackTrace();
        }

    }

    private static void createParagraphInAnotherDocument(XWPFDocument document, String text) {
        XWPFParagraph newPar = document.createParagraph();
        newPar.createRun().setText(text, 0);
    }

}
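
One detail worth checking in the code above: with HWPF, paragraph.text() typically returns the text together with its trailing paragraph mark (a carriage return), and text taken from table cells may end with the cell mark (U+0007). If such control characters show up in the generated .docx, they can be trimmed before calling setText. Below is a minimal sketch that uses only the JDK; the class and method names (ParagraphText, stripParagraphMark) are hypothetical helpers for illustration and are not part of POI.

```java
final class ParagraphText {

    private ParagraphText() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Returns the text without trailing carriage returns, line feeds
     * or table cell marks (U+0007). Characters inside the text are kept.
     */
    static String stripParagraphMark(String text) {
        int end = text.length();
        while (end > 0) {
            char c = text.charAt(end - 1);
            if (c == '\r' || c == '\n' || c == '\u0007') {
                end--;   // drop one trailing control character
            } else {
                break;   // reached regular content
            }
        }
        return text.substring(0, end);
    }
}
```

Under that assumption, the helper would be applied where the run is created, for example newPar.createRun().setText(ParagraphText.stripParagraphMark(text), 0);.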

I used .docx as the output because it is much easier to add new paragraphs to a .docx than to a .doc file. The method insertAfter(ParagraphProperties props, int styleIndex), which inserts a new Paragraph into a given range, is now deprecated (I use POI version 3.10), and I couldn't find an easy and logical way to create a new Paragraph object in an empty .doc file. By contrast, the straightforward and clean XWPFParagraph newPar = document.createParagraph(); is a pleasure to use.

However, this code uses .doc as the input, as required in your task. Hope this helps :)

P.S. Here we use a simple selection criterion based on paragraph indices. If you need something like a font-based criterion, as you mentioned, you may want to post another question, or you may find the solution yourself. Either way, things get easier with .docx.
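
To make the selection criterion easier to swap out, the index check can be kept separate from the POI calls. The sketch below is one possible way to do that with plain JDK types; ParagraphRouter, addRange and targetFor are hypothetical names introduced for illustration, not POI API. It reproduces the index ranges used in the answer (11..20 and 21..30, inclusive).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.IntPredicate;

final class ParagraphRouter {

    // Target file name -> rule deciding whether a paragraph index belongs to it.
    // LinkedHashMap keeps registration order, so earlier targets win on overlap.
    private final Map<String, IntPredicate> rules = new LinkedHashMap<>();

    /** Registers an inclusive index range [from, to] for the given target file. */
    ParagraphRouter addRange(String target, int from, int to) {
        IntPredicate inRange = index -> index >= from && index <= to;
        // Several ranges for the same target are combined with logical OR.
        rules.merge(target, inRange, IntPredicate::or);
        return this;
    }

    /** Returns the first target whose rule accepts the index, or empty if none does. */
    Optional<String> targetFor(int paragraphIndex) {
        for (Map.Entry<String, IntPredicate> rule : rules.entrySet()) {
            if (rule.getValue().test(paragraphIndex)) {
                return Optional.of(rule.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ParagraphRouter router = new ParagraphRouter()
                .addRange("us.docx", 11, 20)
                .addRange("japan.docx", 21, 30);
        for (int i : new int[] {5, 11, 20, 21, 30, 31}) {
            System.out.println(i + " -> " + router.targetFor(i).orElse("(skipped)"));
        }
    }
}
```

Inside the loop of the answer, the if/else chain would then become a single lookup of router.targetFor(parIndex). A font-based criterion would need the paragraph itself rather than its index (for instance by inspecting its character runs), so the rule type would have to change accordingly.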
