PDFBox: Problem with converting pdf page into image
Question
My task is pretty simple: convert every page of a PDF file into an image. I tried the open source version of ICEpdf to generate the images, but it does not render them with the correct fonts. So I started using PDFBox instead. The code is the following:
    PDDocument document = PDDocument.load(new File("testing.pdf"));
    List<PDPage> pages = document.getDocumentCatalog().getAllPages();
    for (int i = 0; i < pages.size(); i++) {
        PDPage singlePage = pages.get(i);
        BufferedImage buffImage = convertToImage(singlePage, 8, 12);
        ImageIO.write(buffImage, "png", new File(PdfUtil.DATA_OUTPUT_DIR+(count++)+".png"));
    }
The fonts look good, but the pictures within the PDF file look faded (see the attachment). I looked into the source code but still have no clue how to fix it. Does anyone have an idea what's going on? Please help. Thanks!
Recommended answer
Convert the PDF file 04-Request-Headers.pdf to images using PDFBox.
Download this file and place it in the Documents folder.
Example:
    package com.pdf.pdfbox.test;

    import java.awt.HeadlessException;
    import java.awt.Toolkit;
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.util.List;

    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.util.PDFImageWriter;

    // Uses the PDFBox 1.x API (getAllPages and PDFImageWriter).
    public class ConvertPDFPageToImageWithoutText {
        public static void main(String[] args) {
            try {
                String oldPath = "C:/Documents/04-Request-Headers.pdf";
                File oldFile = new File(oldPath);
                if (oldFile.exists()) {
                    PDDocument document = PDDocument.load(oldPath);
                    try {
                        @SuppressWarnings("unchecked")
                        List<PDPage> list = document.getDocumentCatalog().getAllPages();
                        String fileName = oldFile.getName().replace(".pdf", "");
                        String imageFormat = "png";
                        String password = "";
                        int startPage = 1;
                        int endPage = list.size();
                        String outputPrefix = "C:/Documents/PDFCopy/"; // converted images are saved here
                        File file = new File(outputPrefix);
                        if (!file.exists()) {
                            file.mkdirs();
                        }

                        // Use the screen resolution, or 96 dpi when no display is available.
                        int resolution;
                        try {
                            resolution = Toolkit.getDefaultToolkit().getScreenResolution();
                        } catch (HeadlessException e) {
                            resolution = 96;
                        }

                        // Map the color mode name to a BufferedImage type; RGB is the default.
                        String color = "rgb";
                        int imageType = BufferedImage.TYPE_INT_RGB;
                        if ("bilevel".equalsIgnoreCase(color)) {
                            imageType = BufferedImage.TYPE_BYTE_BINARY;
                        } else if ("indexed".equalsIgnoreCase(color)) {
                            imageType = BufferedImage.TYPE_BYTE_INDEXED;
                        } else if ("gray".equalsIgnoreCase(color)) {
                            imageType = BufferedImage.TYPE_BYTE_GRAY;
                        } else if ("rgb".equalsIgnoreCase(color)) {
                            imageType = BufferedImage.TYPE_INT_RGB;
                        } else if ("rgba".equalsIgnoreCase(color)) {
                            imageType = BufferedImage.TYPE_INT_ARGB;
                        } else {
                            System.err.println("Error: unknown color mode '" + color + "', falling back to rgb.");
                        }

                        // Writes one image per page, named from the output prefix and the file name.
                        PDFImageWriter pdfImageWriter = new PDFImageWriter();
                        boolean imageWriter = pdfImageWriter.writeImage(document, imageFormat, password,
                                startPage, endPage, outputPrefix + fileName, imageType, resolution);
                        if (!imageWriter) {
                            throw new Exception("No writer found for format '" + imageFormat + "'");
                        }
                    } finally {
                        // Close the document even if rendering fails.
                        document.close();
                    }
                } else {
                    System.err.println(oldPath + " File Can't be found");
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
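The two settings passed to writeImage above, the image type and the resolution, can be explored without PDFBox. The following is only a sketch under stated assumptions: the class RenderSettings and the helpers imageTypeFor and pixelsFor are hypothetical names that do not appear in the answer, and the size calculation assumes the default PDF user space of 72 units per inch, so it approximates the output size rather than reproducing PDFBox's exact rounding.

```java
import java.awt.image.BufferedImage;
import java.util.Locale;

// Hypothetical helper class, not part of PDFBox or the answer above.
public class RenderSettings {

    // Assumption: default PDF user space, 72 units per inch.
    public static final float POINTS_PER_INCH = 72f;

    // Maps a color-mode name to a BufferedImage type; unknown names fall back to RGB.
    public static int imageTypeFor(String color) {
        switch (color.toLowerCase(Locale.ROOT)) {
            case "bilevel": return BufferedImage.TYPE_BYTE_BINARY;
            case "indexed": return BufferedImage.TYPE_BYTE_INDEXED;
            case "gray":    return BufferedImage.TYPE_BYTE_GRAY;
            case "rgba":    return BufferedImage.TYPE_INT_ARGB;
            case "rgb":
            default:        return BufferedImage.TYPE_INT_RGB;
        }
    }

    // Approximate pixel length of a page side, given its size in points and the target DPI.
    public static int pixelsFor(float points, int dpi) {
        return Math.round(points * dpi / POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        // A US Letter page is 612 x 792 points.
        System.out.println(pixelsFor(612f, 96) + " x " + pixelsFor(792f, 96)); // prints "816 x 1056"
        System.out.println(pixelsFor(612f, 300) + " x " + pixelsFor(792f, 300)); // prints "2550 x 3300"
    }
}
```

At screen resolution (often 96 dpi) a Letter page comes out at under a megapixel, so passing a higher resolution such as 300 is a common way to get sharper images at the cost of larger files.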
Or try the solution below for converting PDF files to images:
How to convert PDF to image with resolution in Java using PDF Renderer