How to read doc file using Poi?


Question

I am trying to view a Word file in my editor pane. I tried these lines:

import java.awt.Dimension;
import java.awt.GridLayout;
import java.io.File;
import java.io.FileInputStream;
import javax.swing.JEditorPane;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public class editorpane extends JEditorPane
{
    public editorpane(File file)
    {
        try
        {
            FileInputStream fis = new FileInputStream(file.getAbsolutePath());
            HWPFDocument hwpfd = new HWPFDocument(fis);
            WordExtractor we = new WordExtractor(hwpfd);
            String[] array = we.getParagraphText();
            for (int i = 0; i < array.length; i++)
            {
                this.setPage(array[i]);
            }
        } catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}

But it gives me this exception:

org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:131)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
at org.apache.poi.hwpf.HWPFDocumentCore.verifyAndBuildPOIFS(HWPFDocumentCore.java:106)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:174)
at frame1.editorpane.<init>(editorpane.java:24)

at this line:

HWPFDocument hwpfd = new HWPFDocument(fis);

How can I solve this?

Besides, I am not sure about these lines:

for (int i = 0; i < array.length; i++)
{
    this.setPage(array[i]);
}

Can someone confirm whether they are correct?

Recommended answer

You are trying to open a .docx file (XWPF) with code for .doc (HWPF) files. You can use XWPFWordExtractor for .docx files.
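To see why HWPFDocument rejects the file, it can help to look at its first bytes: a binary .doc is an OLE2 compound file, while a .docx is a ZIP package. The following is a minimal sketch in plain Java without POI. The class name WordFormatSniffer and its Kind enum are hypothetical and only illustrate the check; they are not part of POI.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not a POI class): guesses the container format
// of a Word file from its leading bytes.
final class WordFormatSniffer {

    enum Kind { OLE2, ZIP, UNKNOWN }

    // Signature of an OLE2 compound file (binary .doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header; .docx files are ZIP packages.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    static Kind detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Kind.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Kind.ZIP;
        }
        return Kind.UNKNOWN;
    }

    static Kind detect(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop.
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    break; // file is shorter than the signature
                }
                filled += n;
            }
        }
        byte[] actual = new byte[filled];
        System.arraycopy(header, 0, actual, 0, filled);
        return detect(actual);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A result of Kind.ZIP only says that the file is a ZIP archive, which is consistent with .docx but does not prove it. In that case the XWPF classes are the ones to try, and for Kind.OLE2 the HWPF classes from the question.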

There is also an ExtractorFactory, which lets POI decide which of the two formats applies and use the correct class to open the file. However, you can then no longer iterate over the text piece by piece, because the returned extractor only offers the generic getText() method.

Use it like this:

POITextExtractor extractor = ExtractorFactory.createExtractor(file);
extractor.getText();
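As for the loop in the question: JEditorPane.setPage(String) treats its argument as a URL to load, so passing paragraph text to it fails. Even with setText, calling it once per paragraph would leave only the last paragraph visible, because each call replaces the content. A minimal sketch of one way around this, assuming the paragraphs arrive as a String[] such as the one returned by getParagraphText(); ParagraphViewer and its method names are hypothetical:

```java
import javax.swing.JEditorPane;

// Hypothetical helper: shows extracted paragraphs in a JEditorPane
// as plain text, using setText instead of setPage.
final class ParagraphViewer {

    // Joins paragraphs into one string, one paragraph per line.
    // Trailing whitespace is stripped first, on the assumption that
    // extracted paragraphs may already end with a line terminator.
    static String joinParagraphs(String[] paragraphs) {
        StringBuilder sb = new StringBuilder();
        for (String paragraph : paragraphs) {
            if (paragraph == null) {
                continue; // skip missing entries
            }
            sb.append(paragraph.replaceAll("\\s+\\z", ""));
            sb.append('\n');
        }
        return sb.toString();
    }

    // Replaces the pane's content with the joined text in a single call.
    static void show(JEditorPane pane, String[] paragraphs) {
        pane.setContentType("text/plain");
        pane.setText(joinParagraphs(paragraphs));
    }
}
```

The same show method also works with the single string from getText() by wrapping it in a one-element array, or by calling setText on the pane directly.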
