Read Excel and get the row number and file extension for embedded objects

Problem description

I am using Apache POI with Java 1.8 in my application, where I try to read an Excel file and get its embedded objects.

I need to know how to get the row number and file extension of each embedded OLE object.

Workbook workbook = WorkbookFactory.create(new File(path));

XSSFWorkbook fWorkbook = (XSSFWorkbook) workbook;

List<PackagePart> embeddedDocs = fWorkbook.getAllEmbedds();

Calling getContentType() on the parts in embeddedDocs returns application/vnd.openxmlformats-officedocument.oleObject.

Is there any way to get the file extension (e.g. pdf, ppt, mp3) instead of only this MIME type? And how can I get the row number of each embedded object? Any ideas or coding logic to resolve this would be very useful.

Recommended answer

The following works for, I guess, the usual suspects. I've tested it with .xls and .xlsx files on the POI trunk, which will become POI 4.1.0, but it should work with POI 4.0.1 too.

Known issues:

  • If an object wasn't embedded from a file, you don't get a filename. This probably also applies to most .xls files.

  • If a .xlsx only contains a vmlDrawing*.xml, the DrawingPatriarch can't be extracted and no shapes can be determined.

  • If a shape in a .xlsx wasn't anchored via twoCellAnchor, you don't get a ClientAnchor.

Code:

import java.io.FileInputStream;
import java.io.IOException;

import org.apache.poi.hpsf.ClassIDPredefined;
import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.ss.usermodel.ChildAnchor;
import org.apache.poi.ss.usermodel.ClientAnchor;
import org.apache.poi.ss.usermodel.Drawing;
import org.apache.poi.ss.usermodel.ObjectData;
import org.apache.poi.ss.usermodel.Shape;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.junit.Test;

public class TestEmbed {
    @Test
    public void extract() throws IOException {
//        String xlsName = "test-data/spreadsheet/WithEmbeddedObjects.xls";
        String xlsName = "embed.xlsx";
        try (FileInputStream fis = new FileInputStream(xlsName);
             Workbook xls = WorkbookFactory.create(fis)) {
            for (Sheet s : xls) {
                Drawing<?> dp = s.getDrawingPatriarch();
                if (dp != null) {
                    for (Shape sh : dp) {
                        if (sh instanceof ObjectData) {
                            ObjectData od = (ObjectData)sh;
                            String filename = od.getFileName();
                            String ext = null;

                            if (filename != null && !filename.isEmpty()) {
                                int i = filename.lastIndexOf('.');
                                ext = (i > 0) ? filename.substring(i) : ".bin";
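                            } else {
                                // No filename: fall back to the extension registered for the OLE storage class ID
                                try {
                                    DirectoryEntry de = od.getDirectory();

                                    if (de != null) {
                                        ClassIDPredefined ctcls = ClassIDPredefined.lookup(de.getStorageClsid());
                                        if (ctcls != null) {
                                            ext = ctcls.getFileExtension();
                                        }
                                    }
                                } catch (Exception ignore) {
                                    // directory could not be read: leave ext as null so the ".bin" default below applies
                                }
                            }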

                            if (ext == null) {
                                ext = ".bin";
                            }

                            // Only a ClientAnchor carries row information; getRow1()/getRow2() are 0-based indices
                            ChildAnchor chAnc = sh.getAnchor();
                            if (chAnc instanceof ClientAnchor) {
                                ClientAnchor anc = (ClientAnchor) chAnc;
                                System.out.println("Rows: " + anc.getRow1() + " to " + anc.getRow2() + " - filename: "+filename+" - ext: "+ext);
                            }
                        }
                    }
                }
            }
        }
    }
}
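
The filename handling and the printed row values in the code above can be pulled into two small helpers. The following is only a sketch in plain Java with no POI dependency: the class EmbedInfo and its methods extensionOf and toExcelRow are hypothetical names, not part of Apache POI. Stripping a leading path and lower-casing the extension are assumptions that go beyond the answer's logic; the ".bin" fallback mirrors it. The row helper relies on ClientAnchor row indices being 0-based.

```java
import java.util.Locale;

/** Hypothetical helpers for illustration; not part of Apache POI. */
final class EmbedInfo {
    private EmbedInfo() {}

    /**
     * Returns the lower-cased extension including the leading dot,
     * or ".bin" when no extension can be derived from the name.
     */
    static String extensionOf(String filename) {
        if (filename == null) {
            return ".bin";
        }
        // Assumption: the stored name may include a Windows or Unix path,
        // so only the last path segment is inspected.
        int sep = Math.max(filename.lastIndexOf('/'), filename.lastIndexOf('\\'));
        String name = filename.substring(sep + 1);
        int dot = name.lastIndexOf('.');
        // dot <= 0 covers "no dot" and names that start with a dot;
        // a trailing dot has no extension either.
        if (dot <= 0 || dot == name.length() - 1) {
            return ".bin";
        }
        return name.substring(dot).toLowerCase(Locale.ROOT);
    }

    /** Converts a 0-based row index to the 1-based row number shown in Excel. */
    static int toExcelRow(int zeroBasedRow) {
        return zeroBasedRow + 1;
    }
}
```

With these in place, the print statement could report EmbedInfo.toExcelRow(anc.getRow1()) so that the output matches the row labels a user sees in Excel.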
