PDFBox: Problem with converting pdf page into image
Problem description
My task is pretty simple: convert every page of a PDF file into an image. I tried the open source version of ICEpdf, but it does not render the images with the correct fonts, so I switched to PDFBox. The code is the following:
PDDocument document = PDDocument.load(new File("testing.pdf"));
List<PDPage> pages = document.getDocumentCatalog().getAllPages();
for (int i = 0; i < pages.size(); i++) {
    PDPage singlePage = pages.get(i);
    BufferedImage buffImage = convertToImage(singlePage, 8, 12);
    ImageIO.write(buffImage, "png", new File(PdfUtil.DATA_OUTPUT_DIR + (count++) + ".png"));
}
The fonts look good, but the pictures within the PDF file look faded (see the attachment). I looked into the source code but still have no clue how to fix it. Does anyone have an idea what is going on? Please help. Thanks!
Recommended answer
Convert the PDF file 04-Request-Headers.pdf to images using PDFBox.
Download this file and place it in the Documents folder (C:/Documents in the example).
Example:
package com.pdf.pdfbox.test;

import java.awt.HeadlessException;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;
import java.io.File;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.util.PDFImageWriter;

public class ConvertPDFPageToImageWithoutText {
    public static void main(String[] args) {
        try {
            String oldPath = "C:/Documents/04-Request-Headers.pdf";
            File oldFile = new File(oldPath);
            if (oldFile.exists()) {
                PDDocument document = PDDocument.load(oldPath);
                try {
                    @SuppressWarnings("unchecked")
                    List<PDPage> list = document.getDocumentCatalog().getAllPages();
                    String fileName = oldFile.getName().replace(".pdf", "");
                    String imageFormat = "png";
                    String password = "";
                    // Page numbers passed to PDFImageWriter are 1-based
                    int startPage = 1;
                    int endPage = list.size();
                    String outputPrefix = "C:/Documents/PDFCopy/"; // converted images are saved here
                    File file = new File(outputPrefix);
                    if (!file.exists()) {
                        file.mkdirs();
                    }
                    String color = "rgb";
                    // Use the screen resolution when a display is available, otherwise 96 DPI
                    int resolution;
                    try {
                        resolution = Toolkit.getDefaultToolkit().getScreenResolution();
                    } catch (HeadlessException e) {
                        resolution = 96;
                    }
                    // Map the color mode name to a BufferedImage type; default to RGB
                    int imageType = BufferedImage.TYPE_INT_RGB;
                    if ("bilevel".equalsIgnoreCase(color)) {
                        imageType = BufferedImage.TYPE_BYTE_BINARY;
                    } else if ("indexed".equalsIgnoreCase(color)) {
                        imageType = BufferedImage.TYPE_BYTE_INDEXED;
                    } else if ("gray".equalsIgnoreCase(color)) {
                        imageType = BufferedImage.TYPE_BYTE_GRAY;
                    } else if ("rgb".equalsIgnoreCase(color)) {
                        imageType = BufferedImage.TYPE_INT_RGB;
                    } else if ("rgba".equalsIgnoreCase(color)) {
                        imageType = BufferedImage.TYPE_INT_ARGB;
                    } else {
                        System.err.println("Error: unknown color mode '" + color + "', falling back to rgb.");
                    }
                    // Writes one image per page, named <outputPrefix><fileName><pageNumber>.<imageFormat>
                    PDFImageWriter pdfImageWriter = new PDFImageWriter();
                    boolean imageWriter = pdfImageWriter.writeImage(document, imageFormat, password,
                            startPage, endPage, outputPrefix + fileName, imageType, resolution);
                    if (!imageWriter) {
                        throw new Exception("No writer found for format '" + imageFormat + "'");
                    }
                } finally {
                    // Release the document even if rendering fails
                    document.close();
                }
            } else {
                System.err.println(oldPath + " File Can't be found");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
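The color-mode selection and the resolution fallback in the example above can also be pulled into small helpers, which makes them easier to check without a PDF at hand. The following is only a sketch using the standard library: the class name `RenderSettings` and both method names are hypothetical and not part of PDFBox, and rejecting an unknown color mode with an exception (instead of printing a message) is an assumption about the desired behavior.

```java
import java.awt.GraphicsEnvironment;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;
import java.util.Locale;

/** Hypothetical helper (not part of PDFBox) for the settings used in the example. */
final class RenderSettings {

    private RenderSettings() {
    }

    /** Maps a color-mode name to a BufferedImage type constant; rejects unknown names. */
    static int imageTypeFor(String color) {
        switch (color.toLowerCase(Locale.ROOT)) {
            case "bilevel":
                return BufferedImage.TYPE_BYTE_BINARY;
            case "indexed":
                return BufferedImage.TYPE_BYTE_INDEXED;
            case "gray":
                return BufferedImage.TYPE_BYTE_GRAY;
            case "rgb":
                return BufferedImage.TYPE_INT_RGB;
            case "rgba":
                return BufferedImage.TYPE_INT_ARGB;
            default:
                throw new IllegalArgumentException("Unknown color mode: " + color);
        }
    }

    /** Returns the screen resolution in DPI, or fallbackDpi when no display is available. */
    static int resolutionOrDefault(int fallbackDpi) {
        if (GraphicsEnvironment.isHeadless()) {
            return fallbackDpi;
        }
        return Toolkit.getDefaultToolkit().getScreenResolution();
    }
}
```

With these in place, the two settings in the example would reduce to `int imageType = RenderSettings.imageTypeFor("rgb");` and `int resolution = RenderSettings.resolutionOrDefault(96);`.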
Alternatively, try the following solution for converting PDF files to images:
How to Convert PDF to image with resolution in java Using PDF Renderer