How to get pictures with names from an xls file using Apache POI

Problem description

Using workbook.getAllPictures() I can get an array of picture data, but unfortunately it is only the data: those objects have no methods for accessing the name of the picture or any other related information.

There is an HSSFPicture class that would contain all the details of a picture, but how can I get, for example, an array of those objects from the xls?

Update:

I found the SO question "How can I find a cell, which contain a picture in apache poi", which has a method for looping through all the pictures in the worksheet. That works.

Now that I have been able to try the HSSFPicture class, I found that the getFileName() method returns the file name without the extension. I can use getPictureData().suggestFileExtension() to get a suggested file extension, but I really need the extension the picture had when it was added to the xls file. Is there a way to get it?

Update 2:

The pictures are added to the xls with a macro. This is the part of the macro that adds the images to the sheet. fname is the full path and imageName is the file name; both include the extension.

Set img = Sheets("Receipt images").Pictures.Insert(fname)
img.Left = 10
img.top = top + 10
img.Name = imageName
Set img = Nothing

This is the routine that checks whether the picture already exists in the Excel file.

For Each img In Sheets("Receipt images").Shapes
    If img.Name = imageName Then
        Set foundImage = img
        Exit For
    End If
Next

This recognizes that "image.jpg" is different from "image.gif", so img.Name includes the extension.
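Since the macro stores the full file name in img.Name, the original extension could in principle be read from the shape name once that name is available on the Java side. The following is only a minimal sketch under that assumption: PictureNames and extensionOf are hypothetical helper names, the shape name is assumed to have been obtained already as a String, and the fallback argument stands for whatever getPictureData().suggestFileExtension() returns.

```java
class PictureNames {

    // Hypothetical helper: returns the text after the last dot of the shape name,
    // or the fallback when the name is missing or has no usable extension.
    static String extensionOf(String shapeName, String fallback) {
        if (shapeName == null) {
            return fallback;
        }
        int dot = shapeName.lastIndexOf('.');
        // No dot, a leading dot only, or a trailing dot: nothing usable.
        if (dot <= 0 || dot == shapeName.length() - 1) {
            return fallback;
        }
        return shapeName.substring(dot + 1);
    }
}
```

The extension is returned exactly as it was stored, so "Image.JPG" yields "JPG" rather than a normalized form.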

Recommended answer

The shape names are not exposed by the default POI objects, so if we need them we have to work with the underlying records. For shapes in HSSF that is mainly the EscherAggregate (http://poi.apache.org/apidocs/org/apache/poi/hssf/record/EscherAggregate.html), which we can get from the sheet. Through its parent class AbstractEscherHolderRecord we can reach all the EscherOptRecords, which contain the options of the shapes. The groupshape.shapename entries are among those options.

My example is not a complete solution. It is only meant to show which objects could be used to achieve this.

Example:

import org.apache.poi.hssf.usermodel.*;
import org.apache.poi.ss.usermodel.*;

import java.io.FileOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

import org.apache.poi.hssf.record.*;
import org.apache.poi.ddf.*;

import java.util.List;
import java.util.ArrayList;

class ShapeNameTestHSSF {

 public static void main(String[] args) {
  try {

   InputStream inp = new FileInputStream("workbook1.xls");
   Workbook wb = WorkbookFactory.create(inp);

   Sheet sheet = wb.getSheetAt(0);

   EscherAggregate escherAggregate = ((HSSFSheet)sheet).getDrawingEscherAggregate();

   EscherContainerRecord escherContainer = escherAggregate.getEscherContainer().getChildContainers().get(0); 
   //throws java.lang.NullPointerException if no Container present

   List<EscherRecord> escherOptRecords = new ArrayList<EscherRecord>();

   escherContainer.getRecordsById(EscherOptRecord.RECORD_ID, escherOptRecords);

   for (EscherRecord escherOptRecord : escherOptRecords) {
    for (EscherProperty escherProperty : ((EscherOptRecord)escherOptRecord).getEscherProperties()) {
     System.out.println(escherProperty.getName());
     if (escherProperty.isComplex()) {
      System.out.println(new String(((EscherComplexProperty)escherProperty).getComplexData(), "UTF-16LE"));
     } else {
      if (escherProperty.isBlipId()) System.out.print("BlipId = ImageId = ");
      System.out.println(((EscherSimpleProperty)escherProperty).getPropertyValue());
     }
     System.out.println("=============================");
    }
    System.out.println(":::::::::::::::::::::::::::::");
   }


   FileOutputStream fileOut = new FileOutputStream("workbook1.xls");
   wb.write(fileOut);
   fileOut.flush();
   fileOut.close();

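  } catch (InvalidFormatException ifex) {
   // declared by WorkbookFactory.create in the POI release this example was written against
   ifex.printStackTrace();
  } catch (FileNotFoundException fnfex) {
   // workbook1.xls could not be opened
   fnfex.printStackTrace();
  } catch (IOException ioex) {
   // reading or writing the workbook failed
   ioex.printStackTrace();
  }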
 }
}
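The example decodes complex property data as UTF-16LE. In the Escher format such strings are typically stored with a terminating null character, which would then end up at the end of the printed name. The following is a minimal sketch of a decoding step that drops it; EscherStrings and decode are hypothetical names, and the byte array is assumed to be what getComplexData() returned.

```java
import java.nio.charset.StandardCharsets;

class EscherStrings {

    // Hypothetical helper: decodes UTF-16LE bytes and removes trailing NUL characters.
    static String decode(byte[] complexData) {
        String text = new String(complexData, StandardCharsets.UTF_16LE);
        int end = text.length();
        while (end > 0 && text.charAt(end - 1) == '\0') {
            end--;
        }
        return text.substring(0, end);
    }
}
```

Using StandardCharsets.UTF_16LE instead of the charset name "UTF-16LE" also avoids the checked UnsupportedEncodingException.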

Again: this is not a ready-to-use solution. One cannot be provided here because of the complexity of the EscherRecords. To get the correct EscherRecords for the image shapes and their related EscherOptRecords, you may have to loop recursively through all the EscherRecords in the EscherAggregate, checking whether each one is a container record and, if so, looping through its children, and so on.
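The recursive walk described above could look roughly like the following. This is a sketch, not a tested solution: ShapeNameCollector and collect are hypothetical names, and the constant 0x0380 is assumed to be the property number of groupshape.shapename. It needs the POI library on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

import org.apache.poi.ddf.EscherComplexProperty;
import org.apache.poi.ddf.EscherOptRecord;
import org.apache.poi.ddf.EscherProperty;
import org.apache.poi.ddf.EscherRecord;

class ShapeNameCollector {

    // Assumption: property number of groupshape.shapename in the Escher format.
    static final int SHAPE_NAME_PROPERTY = 0x0380;

    // Walks the records depth-first and appends every shape name found to 'names'.
    static void collect(List<? extends EscherRecord> records, List<String> names) {
        for (EscherRecord record : records) {
            if (record instanceof EscherOptRecord) {
                for (EscherProperty property : ((EscherOptRecord) record).getEscherProperties()) {
                    if (property.getPropertyNumber() == SHAPE_NAME_PROPERTY
                            && property instanceof EscherComplexProperty) {
                        byte[] data = ((EscherComplexProperty) property).getComplexData();
                        // trim() drops the terminating NUL, but also surrounding whitespace
                        names.add(new String(data, StandardCharsets.UTF_16LE).trim());
                    }
                }
            }
            if (record.isContainerRecord()) {
                // Descend into the children of container records.
                collect(record.getChildRecords(), names);
            }
        }
    }
}
```

With the escherAggregate from the example, a call such as ShapeNameCollector.collect(escherAggregate.getEscherRecords(), names) would start the walk at the top-level records. Matching each name to its picture data (for instance via the BlipId printed in the example) is left out here.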
