Trying to make a simple PDF document with Apache POI
Problem description
I see the internet is riddled with people complaining about Apache's PDF products, but I cannot find my particular use case here. I am trying to do a simple Hello World with Apache POI. Right now my code is as follows:
public ByteArrayOutputStream export() throws IOException {
//Blank Document
XWPFDocument document = new XWPFDocument();
//Collect the converted output in memory
ByteArrayOutputStream out = new ByteArrayOutputStream();
//create table
XWPFTable table = document.createTable();
XWPFStyles styles = document.createStyles();
styles.setSpellingLanguage("English");
//create first row
XWPFTableRow tableRowOne = table.getRow(0);
tableRowOne.getCell(0).setText("col one, row one");
tableRowOne.addNewTableCell().setText("col two, row one");
tableRowOne.addNewTableCell().setText("col three, row one");
//create second row
XWPFTableRow tableRowTwo = table.createRow();
tableRowTwo.getCell(0).setText("col one, row two");
tableRowTwo.getCell(1).setText("col two, row two");
tableRowTwo.getCell(2).setText("col three, row two");
//create third row
XWPFTableRow tableRowThree = table.createRow();
tableRowThree.getCell(0).setText("col one, row three");
tableRowThree.getCell(1).setText("col two, row three");
tableRowThree.getCell(2).setText("col three, row three");
PdfOptions options = PdfOptions.create();
PdfConverter.getInstance().convert(document, out, options);
out.close();
return out;
}
The code that calls it is:
public ResponseEntity<Resource> convertToPDFPost(@ApiParam(value = "DTOs passed from the FE" ,required=true ) @Valid @RequestBody ExportEnvelopeDTO exportDtos) {
if (exportDtos.getProdExportDTOs() != null) {
try {
FileOutputStream out = new FileOutputStream("/Users/kornhaus/Desktop/test.pdf");
out.write(exporter.export().toByteArray());
out.close();
} catch (IOException e) {
e.printStackTrace();
}
return new ResponseEntity<Resource>(responseFile, responseHeaders, HttpStatus.OK);
}
return new ResponseEntity<Resource>(HttpStatus.INTERNAL_SERVER_ERROR);
}
}
On this line here: out.write(exporter.export().toByteArray());
the code throws an exception:
org.apache.poi.xwpf.converter.core.XWPFConverterException: java.io.IOException: Unable to parse xml bean
I have no clue what's causing this, or even where to look for this kind of documentation. I have been coding for more than a decade and have never had such difficulty with what should be a simple Java library. Any help would be great.
Recommended answer
The main problem is that PdfOptions and PdfConverter are not part of the Apache POI project. They are developed by opensagres, and the first versions were misleadingly named org.apache.poi.xwpf.converter.pdf.PdfOptions and org.apache.poi.xwpf.converter.pdf.PdfConverter. Those old classes have not been updated since 2014 and require Apache POI version 3.9.
But the same developers provide fr.opensagres.poi.xwpf.converter.pdf, which is much more current and works with the latest stable release, Apache POI 3.17. So we should use this.
But since even those newer PdfOptions and PdfConverter are not part of the Apache POI project, Apache POI does not test them with its releases. As a result, the default *.docx documents created by Apache POI lack some content that PdfConverter needs:
There must be a styles document, even if it is empty.
There must be section properties for the page, with at least the page size set.
Tables must have a table grid set.
To fulfill these requirements, we must add some extra code to our program. Unfortunately, this requires the full jar of all of the schemas, ooxml-schemas-1.3.jar, as mentioned in Faq-N10025.
And because we need to change the underlying low-level objects, the document must be written out so that the underlying objects are committed. Otherwise the XWPFDocument that we hand over to PdfConverter will be incomplete.
Example:
import java.io.*;
import java.math.BigInteger;
//needed jars: fr.opensagres.poi.xwpf.converter.core-2.0.1.jar,
// fr.opensagres.poi.xwpf.converter.pdf-2.0.1.jar,
// fr.opensagres.xdocreport.itext.extension-2.0.1.jar,
// itext-2.1.7.jar
import fr.opensagres.poi.xwpf.converter.pdf.PdfOptions;
import fr.opensagres.poi.xwpf.converter.pdf.PdfConverter;
//needed jars: apache poi and its dependencies
// and additionally: ooxml-schemas-1.3.jar
import org.apache.poi.xwpf.usermodel.*;
import org.apache.poi.util.Units;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;
public class XWPFToPDFConverterSampleMin {
public static void main(String[] args) throws Exception {
XWPFDocument document = new XWPFDocument();
// there must be a styles document, even if it is empty
XWPFStyles styles = document.createStyles();
// there must be section properties for the page having at least the page size set
CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
CTPageSz pageSz = sectPr.addNewPgSz();
pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"
// filling the body
XWPFParagraph paragraph = document.createParagraph();
//create table
XWPFTable table = document.createTable();
//create first row
XWPFTableRow tableRowOne = table.getRow(0);
tableRowOne.getCell(0).setText("col one, row one");
tableRowOne.addNewTableCell().setText("col two, row one");
tableRowOne.addNewTableCell().setText("col three, row one");
//create CTTblGrid for this table with widths of the 3 columns.
//necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
//values are in unit twentieths of a point (1/1440 of an inch)
//first column = 2 inches width
table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
//other columns (2 in this case) also each 2 inches width
for (int col = 1 ; col < 3; col++) {
table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
}
//create second row
XWPFTableRow tableRowTwo = table.createRow();
tableRowTwo.getCell(0).setText("col one, row two");
tableRowTwo.getCell(1).setText("col two, row two");
tableRowTwo.getCell(2).setText("col three, row two");
//create third row
XWPFTableRow tableRowThree = table.createRow();
tableRowThree.getCell(0).setText("col one, row three");
tableRowThree.getCell(1).setText("col two, row three");
tableRowThree.getCell(2).setText("col three, row three");
paragraph = document.createParagraph();
//trying picture
XWPFRun run = paragraph.createRun();
run.setText("The picture in line: ");
InputStream in = new FileInputStream("samplePict.jpeg");
run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
in.close();
run.setText(" text after the picture.");
paragraph = document.createParagraph();
//document must be written so underlying objects will be committed
ByteArrayOutputStream out = new ByteArrayOutputStream();
document.write(out);
document.close();
document = new XWPFDocument(new ByteArrayInputStream(out.toByteArray()));
PdfOptions options = PdfOptions.create();
PdfConverter converter = (PdfConverter)PdfConverter.getInstance();
//close the PDF output stream once the conversion is done
try (OutputStream pdfOut = new FileOutputStream("XWPFToPDFConverterSampleMin.pdf")) {
converter.convert(document, pdfOut, options);
}
document.close();
}
}
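The unit comments in the example above (twips, points, inches) can be made explicit with a small helper. This is only a sketch using the JDK: the class name TwipUnits and its methods are hypothetical and not part of Apache POI or opensagres. It relies on the relations stated in those comments (1 pt = 20 twips, 1 inch = 72 pt = 1440 twips) and returns BigInteger because setW and setH are called with BigInteger values in the example.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of Apache POI: converts between the units
// used in the example's comments (twips, points, inches).
public final class TwipUnits {

    static final int TWIPS_PER_POINT = 20;
    static final int POINTS_PER_INCH = 72;
    static final int TWIPS_PER_INCH = TWIPS_PER_POINT * POINTS_PER_INCH; // 1440

    private TwipUnits() {
    }

    // Inches to twips, rounded to the nearest whole twip.
    static BigInteger inchesToTwips(double inches) {
        return BigInteger.valueOf(Math.round(inches * TWIPS_PER_INCH));
    }

    // Twips to points (1 pt = 20 twips).
    static double twipsToPoints(BigInteger twips) {
        return twips.doubleValue() / TWIPS_PER_POINT;
    }

    // Twips to inches (1 inch = 1440 twips).
    static double twipsToInches(BigInteger twips) {
        return twips.doubleValue() / TWIPS_PER_INCH;
    }

    public static void main(String[] args) {
        BigInteger letterWidth = inchesToTwips(8.5);
        BigInteger letterHeight = inchesToTwips(11);
        System.out.println(letterWidth + " x " + letterHeight); // 12240 x 15840
        System.out.println(twipsToPoints(letterWidth) + " pt"); // 612.0 pt
    }
}
```

With such a helper, the page size and grid column calls could read pageSz.setW(TwipUnits.inchesToTwips(8.5)) and addNewGridCol().setW(TwipUnits.inchesToTwips(2)) instead of using literal twip values.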
Using XDocReport
Another way would be to use the newest version of opensagres/xdocreport, as described in Converter only with ConverterRegistry:
import java.io.*;
import java.math.BigInteger;
//needed jars: xdocreport-2.0.1.jar,
// odfdom-java-0.8.7.jar,
// itext-2.1.7.jar
import fr.opensagres.xdocreport.converter.Options;
import fr.opensagres.xdocreport.converter.IConverter;
import fr.opensagres.xdocreport.converter.ConverterRegistry;
import fr.opensagres.xdocreport.converter.ConverterTypeTo;
import fr.opensagres.xdocreport.core.document.DocumentKind;
//needed jars: apache poi and its dependencies
// and additionally: ooxml-schemas-1.3.jar
import org.apache.poi.xwpf.usermodel.*;
import org.apache.poi.util.Units;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;
public class XWPFToPDFXDocReport {
public static void main(String[] args) throws Exception {
XWPFDocument document = new XWPFDocument();
// there must be a styles document, even if it is empty
XWPFStyles styles = document.createStyles();
// there must be section properties for the page having at least the page size set
CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
CTPageSz pageSz = sectPr.addNewPgSz();
pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"
// filling the body
XWPFParagraph paragraph = document.createParagraph();
//create table
XWPFTable table = document.createTable();
//create first row
XWPFTableRow tableRowOne = table.getRow(0);
tableRowOne.getCell(0).setText("col one, row one");
tableRowOne.addNewTableCell().setText("col two, row one");
tableRowOne.addNewTableCell().setText("col three, row one");
//create CTTblGrid for this table with widths of the 3 columns.
//necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
//values are in unit twentieths of a point (1/1440 of an inch)
//first column = 2 inches width
table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
//other columns (2 in this case) also each 2 inches width
for (int col = 1 ; col < 3; col++) {
table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
}
//create second row
XWPFTableRow tableRowTwo = table.createRow();
tableRowTwo.getCell(0).setText("col one, row two");
tableRowTwo.getCell(1).setText("col two, row two");
tableRowTwo.getCell(2).setText("col three, row two");
//create third row
XWPFTableRow tableRowThree = table.createRow();
tableRowThree.getCell(0).setText("col one, row three");
tableRowThree.getCell(1).setText("col two, row three");
tableRowThree.getCell(2).setText("col three, row three");
paragraph = document.createParagraph();
//trying picture
XWPFRun run = paragraph.createRun();
run.setText("The picture in line: ");
InputStream in = new FileInputStream("samplePict.jpeg");
run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
in.close();
run.setText(" text after the picture.");
paragraph = document.createParagraph();
//document must be written so underlying objects will be committed
ByteArrayOutputStream out = new ByteArrayOutputStream();
document.write(out);
document.close();
// 1) Create options DOCX 2 PDF to select the right converter from the registry
Options options = Options.getFrom(DocumentKind.DOCX).to(ConverterTypeTo.PDF);
// 2) Get the converter from the registry
IConverter converter = ConverterRegistry.getRegistry().getConverter(options);
// 3) Convert DOCX 2 PDF
InputStream docxin= new ByteArrayInputStream(out.toByteArray());
OutputStream pdfout = new FileOutputStream(new File("XWPFToPDFXDocReport.pdf"));
converter.convert(docxin, pdfout, options);
docxin.close();
pdfout.close();
}
}
October 2018:
This code works using Apache POI 3.17. It cannot work using Apache POI 4.0.0 because of changes in Apache POI that have not yet been taken into account in fr.opensagres.poi.xwpf.converter or in fr.opensagres.xdocreport.converter.
February 2019:
This now works for me using the newest Apache POI version, 4.0.1, and the newest version, 2.0.2, of fr.opensagres.poi.xwpf.converter.core and its companion artifacts.
June 2021:
Works using Apache POI version 4.1.2 and the newest version, 2.0.2, of fr.opensagres.poi.xwpf.converter.core and its companion artifacts.
It cannot work using Apache POI version 5.0.0 because XDocReport needs ooxml-schemas, which Apache POI 5 no longer supports.
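The version notes above can be summarized as a small lookup. This is only a sketch: the class ReportedPoiCompatibility is hypothetical, and it records nothing more than the combinations this answer reports as working (Apache POI 3.17 with the 2.0.1 converter jars listed in the examples, and 4.0.1 and 4.1.2 with 2.0.2). Versions not mentioned here are simply unknown to it, not proven incompatible.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup of the Apache POI / opensagres converter combinations
// that the answer reports as working. It is not an official compatibility matrix.
public final class ReportedPoiCompatibility {

    private static final Map<String, String> REPORTED = new LinkedHashMap<>();

    static {
        REPORTED.put("3.17", "2.0.1");  // October 2018 note, jars listed in the examples
        REPORTED.put("4.0.1", "2.0.2"); // February 2019 note
        REPORTED.put("4.1.2", "2.0.2"); // June 2021 note
    }

    private ReportedPoiCompatibility() {
    }

    // Returns the converter version reported to work with the given POI version,
    // or an empty Optional if the answer reports none (for example 4.0.0 or 5.0.0).
    static Optional<String> converterVersionFor(String poiVersion) {
        return Optional.ofNullable(REPORTED.get(poiVersion));
    }

    public static void main(String[] args) {
        for (String poi : new String[] {"3.17", "4.0.0", "4.1.2", "5.0.0"}) {
            String result = converterVersionFor(poi)
                    .map(v -> "reported to work with converter " + v)
                    .orElse("no working combination reported");
            System.out.println("Apache POI " + poi + ": " + result);
        }
    }
}
```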