How to know the image or picture location while parsing an MS Word .doc in Java using Apache POI

Question

HWPFDocument wordDoc = new HWPFDocument(new FileInputStream(fileName));
List<Picture> picturesList = wordDoc.getPicturesTable().getAllPictures();

The statements above return a list of all the pictures inside a document. How can I find out after which text or position in the document each image is located?

Recommended answer

You're getting at the pictures the wrong way, which is why you're not finding any positions!

What you need to do is process each CharacterRun of the document in turn. Pass each run to the PicturesTable and check whether it contains a picture. If it does, fetch the picture back from the table; you then know where in the document it belongs, because you have the run it came from.

In the simplest case:

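// Needs: org.apache.poi.hwpf.model.PicturesTable and
// org.apache.poi.hwpf.usermodel.{CharacterRun, Paragraph, Picture, Range}
// "document" is the HWPFDocument opened as in the question.
PicturesTable pictureTable = document.getPicturesTable();

Range r = document.getRange();
for (int i = 0; i < r.numParagraphs(); i++) {
    Paragraph p = r.getParagraph(i);
    for (int j = 0; j < p.numCharacterRuns(); j++) {
        CharacterRun cr = p.getCharacterRun(j);
        if (pictureTable.hasPicture(cr)) {
            // Fetch the picture anchored at this run (true = load its bytes too)
            Picture picture = pictureTable.extractPicture(cr, true);
            // Do something useful with the picture; cr.getStartOffset()
            // is the run's character position within the document
        }
    }
}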
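
To answer the "after which text" part of the question, you can accumulate the text of the runs as the loop walks them, so that each picture run is paired with whatever came before it. The sketch below illustrates only that bookkeeping and deliberately does not use POI: `Run`, `Location` and `locate` are hypothetical stand-ins made up for this example. With POI you would presumably read the run text from `CharacterRun.text()` and the picture flag from `PicturesTable.hasPicture(cr)`. Note that the offsets here count only the collected text, so they will not necessarily match the offsets POI reports.

```java
import java.util.ArrayList;
import java.util.List;

public class PictureLocator {

    /** Hypothetical stand-in for a character run: its text and whether it anchors a picture. */
    static final class Run {
        final String text;
        final boolean hasPicture;

        Run(String text, boolean hasPicture) {
            this.text = text;
            this.hasPicture = hasPicture;
        }
    }

    /** Hypothetical result type: offset into the collected text, plus the text before the picture. */
    static final class Location {
        final int offset;
        final String precedingText;

        Location(int offset, String precedingText) {
            this.offset = offset;
            this.precedingText = precedingText;
        }
    }

    /** Walks the runs in document order and records one Location per picture run. */
    static List<Location> locate(List<Run> runs) {
        List<Location> found = new ArrayList<>();
        StringBuilder textSoFar = new StringBuilder();
        for (Run run : runs) {
            if (run.hasPicture) {
                // Everything collected so far is the text that precedes this picture.
                found.add(new Location(textSoFar.length(), textSoFar.toString()));
            } else {
                textSoFar.append(run.text);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Run> runs = new ArrayList<>();
        runs.add(new Run("Intro ", false));
        runs.add(new Run("", true));          // a run that anchors a picture
        runs.add(new Run("More text", false));
        runs.add(new Run("", true));          // a second picture run
        for (Location loc : locate(runs)) {
            System.out.println(loc.offset + ": " + loc.precedingText);
        }
    }
}
```

Keeping the whole preceding text is the simplest choice for a sketch; for large documents you may prefer to store only the offset, or the last paragraph seen before the picture.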

You can find a good example of doing this in the Apache Tika parser for Microsoft Word .doc files, which is powered by Apache POI.
