Error: org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectForm cannot be cast to org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage

Views: 516  Published: 2020/5/25 5:31:00  Tags: java pdf pdfbox


Problem Description

I am trying to extract images from a PDF using pdfbox. I took help from this post. It worked for some PDFs, but for most of the others it did not. For example, I am not able to extract the figures in this file.

After some research I found that PDResources.getImages is deprecated, so I am using PDResources.getXObjects() instead. With this, I cannot extract any image from the PDF and instead get this message on the console:

org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectForm cannot be cast to org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage
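
The message suggests that the map returned by getXObjects() holds more than one kind of XObject, so casting every entry to the image type fails on the first form entry. The following is a minimal sketch of that situation, not PDFBox code: XObject, ImageXObject and FormXObject are hypothetical stand-ins for PDXObject, PDXObjectImage and PDXObjectForm, and the map keys are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class XObjectCastDemo {
    // Hypothetical stand-ins for the PDFBox XObject hierarchy (not the real classes).
    static class XObject {}
    static class ImageXObject extends XObject {}
    static class FormXObject extends XObject {}

    /** Casts every entry blindly; throws ClassCastException on the first non-image entry. */
    static List<ImageXObject> castAll(Map<String, XObject> xObjects) {
        List<ImageXObject> images = new ArrayList<>();
        for (XObject value : xObjects.values()) {
            images.add((ImageXObject) value);
        }
        return images;
    }

    /** Keeps only the entries that really are images and skips everything else. */
    static List<ImageXObject> selectImages(Map<String, XObject> xObjects) {
        List<ImageXObject> images = new ArrayList<>();
        for (XObject value : xObjects.values()) {
            if (value instanceof ImageXObject) {
                images.add((ImageXObject) value);
            }
        }
        return images;
    }

    public static void main(String[] args) {
        Map<String, XObject> xObjects = new LinkedHashMap<>();
        xObjects.put("Im0", new ImageXObject()); // illustrative key
        xObjects.put("Fm0", new FormXObject());  // illustrative key

        System.out.println("images found with instanceof check: " + selectImages(xObjects).size());
        try {
            castAll(xObjects);
        } catch (ClassCastException e) {
            System.out.println("blind cast failed with ClassCastException");
        }
    }
}
```

The instanceof check removes the exception, but it also silently skips the form entries, so anything stored inside them is never looked at.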

Now I am stuck and unable to find a solution. Please assist if you can.

////// UPDATE IN REPLY TO COMMENTS //////

I am using pdfbox-1.8.10.

Here is the code:

import java.io.File;
import java.util.List;
import java.util.Map;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage;

public void getimg() throws Exception {
    String sourceDir = "C:/Users/admin/Desktop/pdfbox/mypdfbox/pdfbox/inputs/Yavaa.pdf";
    String destinationDir = "C:/Users/admin/Desktop/pdfbox/mypdfbox/pdfbox/outputs/";
    File oldFile = new File(sourceDir);
    PDDocument document = null;
    try {
        if (oldFile.exists()) {
            document = PDDocument.load(sourceDir);
            // getAllPages() returns a raw List in PDFBox 1.8
            List<PDPage> list = document.getDocumentCatalog().getAllPages();
            String fileName = oldFile.getName().replace(".pdf", "_cover");
            int totalImages = 1;
            for (PDPage page : list) {
                PDResources pdResources = page.getResources();
                if (pdResources == null) {
                    continue; // page has no resource dictionary of its own
                }
                Map pageImages = pdResources.getXObjects();
                if (pageImages != null) {
                    for (Object obj : pageImages.values()) {
                        // Only image XObjects can be written out; form XObjects are skipped here
                        if (obj instanceof PDXObjectImage) {
                            PDXObjectImage pdxObjectImage = (PDXObjectImage) obj;
                            pdxObjectImage.write2file(destinationDir + fileName + "_" + totalImages);
                            totalImages++;
                        }
                    }
                }
            }
        } else {
            System.err.println("File does not exist");
        }
    } catch (Exception e) {
        System.err.println(e.getMessage());
    } finally {
        // Release the file handle even if extraction fails
        if (document != null) {
            document.close();
        }
    }
}

////// PARTIAL SOLUTION //////

I have solved the problem of the error message and updated the post with the corrected code. However, the underlying problem remains: I still cannot extract the images from some files, such as the one mentioned in this post. Is there any solution for this?
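
One possible reason for the missing images, not confirmed for the file in question, is that they sit inside form XObjects rather than directly in the page resources. A form XObject can carry its own resources, so a walk that only looks at the top level never reaches them. The sketch below shows the recursive idea with hypothetical stand-in classes (XObject, ImageXObject, FormXObject and the resources field are made up for illustration). In pdfbox 1.8 the corresponding step would presumably go through PDXObjectForm.getResources() and then getXObjects() again, which is an assumption to check against the API documentation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedImageWalk {
    // Hypothetical stand-ins for the PDFBox XObject hierarchy (not the real classes).
    static class XObject {}

    static class ImageXObject extends XObject {
        final String name;
        ImageXObject(String name) { this.name = name; }
    }

    static class FormXObject extends XObject {
        // A form can hold further XObjects of its own, including other forms.
        final Map<String, XObject> resources = new LinkedHashMap<>();
    }

    /** Collects images at this level and recurses into any form XObjects found. */
    static void collectImages(Map<String, XObject> xObjects, List<ImageXObject> found) {
        if (xObjects == null) {
            return;
        }
        for (XObject value : xObjects.values()) {
            if (value instanceof ImageXObject) {
                found.add((ImageXObject) value);
            } else if (value instanceof FormXObject) {
                collectImages(((FormXObject) value).resources, found);
            }
        }
    }

    public static void main(String[] args) {
        FormXObject form = new FormXObject();
        form.resources.put("Im1", new ImageXObject("nested"));

        Map<String, XObject> pageXObjects = new LinkedHashMap<>();
        pageXObjects.put("Im0", new ImageXObject("top-level"));
        pageXObjects.put("Fm0", form);

        List<ImageXObject> found = new ArrayList<>();
        collectImages(pageXObjects, found);
        for (ImageXObject image : found) {
            System.out.println(image.name);
        }
    }
}
```

This sketch does not guard against forms that reference each other in a cycle, and it would not help if the figures are drawn as vector graphics or stored as inline images, since neither appears in the XObject map.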
