How to read a doc file using POI?


Question

I am trying to view a Word file in my editor pane. I tried these lines:

import java.awt.Dimension;
import java.awt.GridLayout;
import java.io.File;
import java.io.FileInputStream;
import javax.swing.JEditorPane;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public class editorpane extends JEditorPane
{
public editorpane(File file)
{

    try
    {
        FileInputStream fis = new FileInputStream(file.getAbsolutePath());
        HWPFDocument hwpfd = new HWPFDocument(fis);
        WordExtractor we = new WordExtractor(hwpfd);
        String[] array = we.getParagraphText();
        for (int i = 0; i < array.length; i++)
        {
            this.setPage(array[i]);
        }

    } catch (Exception e)
    {
        e.printStackTrace();
    }
}
}

but it gives me:

org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:131)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
at org.apache.poi.hwpf.HWPFDocumentCore.verifyAndBuildPOIFS(HWPFDocumentCore.java:106)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:174)
at frame1.editorpane.<init>(editorpane.java:24)

at this line:

HWPFDocument hwpfd = new HWPFDocument(fis);

How can I solve this?

Besides, I am not sure about these lines:

        for (int i = 0; i < array.length; i++)
        {
            this.setPage(array[i]);
        }

Can I get them confirmed?

Recommended answer

You are trying to open a .docx file (XWPF) with code written for .doc (HWPF) files. For .docx files, use XWPFWordExtractor instead.
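The exception is raised because the file does not start with the OLE2 header that the HWPF classes expect. As an illustration only, here is a minimal sketch of how the two formats can be told apart from their leading bytes before choosing an extractor. It assumes that an OLE2 container starts with the signature D0 CF 11 E0 A1 B1 1A E1 and that a .docx, being a ZIP package, starts with 50 4B 03 04. The class `WordFormatSniffer` and its methods are hypothetical names and not part of POI, and a ZIP signature alone does not prove the file is a Word document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper (not a POI class): guesses the Word format from the file header.
public class WordFormatSniffer {
    // Signature of an OLE2 compound document (.doc, handled by HWPF)
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Signature of a ZIP local file header (.docx is a ZIP package, handled by XWPF)
    private static final byte[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    // Returns "doc", "docx" or "unknown" for the given leading bytes.
    static String detect(byte[] header) {
        if (startsWith(header, OLE2)) return "doc";
        if (startsWith(header, ZIP)) return "docx";
        return "unknown";
    }

    // Reads up to eight bytes from the file and classifies them.
    static String detect(Path file) throws IOException {
        byte[] buf = new byte[8];
        int n = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (n < buf.length) {
                int r = in.read(buf, n, buf.length - n);
                if (r < 0) break; // file shorter than eight bytes
                n += r;
            }
        }
        return detect(Arrays.copyOf(buf, n));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A result of "docx" would point to the XWPF classes, and "doc" to the HWPF classes used in the question.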

There is also an ExtractorFactory, which lets POI decide which of the two formats applies and use the correct class to open the file. The drawback is that you can then no longer iterate over the individual paragraphs, because only the generic getText() method is available.

Use it like this:

POITextExtractor extractor = ExtractorFactory.createExtractor(file);
String text = extractor.getText();
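On the loop from the question: JEditorPane.setPage(String) interprets its argument as a URL to load, so passing paragraph text to it will not display that text. One possible alternative, sketched below under assumptions, is to join the paragraphs into a single string and hand it to setText(). The class `ParagraphDisplay` and its methods are hypothetical names, and stripping the trailing line breaks is only a guess at how the extracted paragraphs are terminated.

```java
import javax.swing.JEditorPane;

// Hypothetical helper: shows extracted paragraphs in a JEditorPane as plain text.
public class ParagraphDisplay {
    // Joins the paragraphs into one string, one paragraph per line.
    static String joinParagraphs(String[] paragraphs) {
        StringBuilder sb = new StringBuilder();
        for (String p : paragraphs) {
            // Drop any line terminator the paragraph already carries (assumption),
            // then end the paragraph with a single newline.
            sb.append(p.replaceAll("[\\r\\n]+\\z", "")).append('\n');
        }
        return sb.toString();
    }

    // Replaces the pane's content with the joined paragraphs.
    static void show(JEditorPane pane, String[] paragraphs) {
        pane.setContentType("text/plain");
        pane.setText(joinParagraphs(paragraphs));
    }
}
```

Inside the constructor from the question, the loop could then be replaced by a call such as `ParagraphDisplay.show(this, array);`.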
