Trying to make simple PDF document with Apache poi


Problem description

I see the internet is riddled with people complaining about apache's pdf products, but I cannot find my particular use case here. I am trying to do a simple Hello World with apache poi. Right now my code is as follows:

public ByteArrayOutputStream export() throws IOException {
    //Blank Document
    XWPFDocument document = new XWPFDocument();

    //Collect the PDF bytes in memory
    ByteArrayOutputStream out = new ByteArrayOutputStream();

    //create table
    XWPFTable table = document.createTable();
    XWPFStyles styles = document.createStyles();
    styles.setSpellingLanguage("English");
    //create first row
    XWPFTableRow tableRowOne = table.getRow(0);
    tableRowOne.getCell(0).setText("col one, row one");
    tableRowOne.addNewTableCell().setText("col two, row one");
    tableRowOne.addNewTableCell().setText("col three, row one");

    //create second row
    XWPFTableRow tableRowTwo = table.createRow();
    tableRowTwo.getCell(0).setText("col one, row two");
    tableRowTwo.getCell(1).setText("col two, row two");
    tableRowTwo.getCell(2).setText("col three, row two");

    //create third row
    XWPFTableRow tableRowThree = table.createRow();
    tableRowThree.getCell(0).setText("col one, row three");
    tableRowThree.getCell(1).setText("col two, row three");
    tableRowThree.getCell(2).setText("col three, row three");

    PdfOptions options = PdfOptions.create();
    PdfConverter.getInstance().convert(document, out, options);
    out.close();
    return out;
}


and the code that calls this is:

    public ResponseEntity<Resource> convertToPDFPost(@ApiParam(value = "DTOs passed from the FE" ,required=true )  @Valid @RequestBody ExportEnvelopeDTO exportDtos) {

        if (exportDtos.getProdExportDTOs() != null) {
            try {
                FileOutputStream out = new FileOutputStream("/Users/kornhaus/Desktop/test.pdf");
                out.write(exporter.export().toByteArray());
                out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
            return new ResponseEntity<Resource>(responseFile, responseHeaders, HttpStatus.OK);
        }

        return new ResponseEntity<Resource>(HttpStatus.INTERNAL_SERVER_ERROR);
    }

}


On this line here: out.write(exporter.export().toByteArray()); the code throws an exception:

org.apache.poi.xwpf.converter.core.XWPFConverterException: java.io.IOException: Unable to parse xml bean

I have no clue what's causing this, or where to even look for this kind of documentation. I have been coding for a decade plus and never had such difficulty with what should be a simple Java library. Any help would be great.

Recommended answer

The main problem with this is that PdfOptions and PdfConverter are not part of the apache poi project. They are developed by opensagres, and the first versions were badly named org.apache.poi.xwpf.converter.pdf.PdfOptions and org.apache.poi.xwpf.converter.pdf.PdfConverter. Those old classes have not been updated since 2014 and need version 3.9 of apache poi.

But the same developers provide fr.opensagres.poi.xwpf.converter.pdf, which is much more current and works with apache poi 3.17, the latest stable release at the time of writing. So we should use this.

But since even those newer PdfOptions and PdfConverter are not part of the apache poi project, apache poi does not test them with its releases. And so the default *.docx documents created by apache poi lack some content which PdfConverter needs:


  1. There must be a styles document, even if it is empty.
  2. There must be section properties for the page, with at least the page size set.
  3. Tables must have a table grid set.

To fulfill this we must add some code to our program. Unfortunately this then needs the full jar of all of the schemas, ooxml-schemas-1.3.jar, as mentioned in Faq-N10025.

And because we need to change the underlying low-level objects, the document must be written so that the underlying objects are committed. Otherwise the XWPFDocument which we hand over to the PdfConverter will be incomplete.

Example:

import java.io.*;
import java.math.BigInteger;

//needed jars: fr.opensagres.poi.xwpf.converter.core-2.0.1.jar, 
//             fr.opensagres.poi.xwpf.converter.pdf-2.0.1.jar,
//             fr.opensagres.xdocreport.itext.extension-2.0.1.jar,
//             itext-2.1.7.jar                                  
import fr.opensagres.poi.xwpf.converter.pdf.PdfOptions;
import fr.opensagres.poi.xwpf.converter.pdf.PdfConverter;

//needed jars: apache poi and its dependencies
//             and additionally: ooxml-schemas-1.3.jar 
import org.apache.poi.xwpf.usermodel.*;
import org.apache.poi.util.Units;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

public class XWPFToPDFConverterSampleMin {

 public static void main(String[] args) throws Exception {

  XWPFDocument document = new XWPFDocument();

  // there must be a styles document, even if it is empty
  XWPFStyles styles = document.createStyles();

  // there must be section properties for the page having at least the page size set
  CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
  CTPageSz pageSz = sectPr.addNewPgSz();
  pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
  pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"

  // filling the body
  XWPFParagraph paragraph = document.createParagraph();

  //create table
  XWPFTable table = document.createTable();

  //create first row
  XWPFTableRow tableRowOne = table.getRow(0);
  tableRowOne.getCell(0).setText("col one, row one");
  tableRowOne.addNewTableCell().setText("col two, row one");
  tableRowOne.addNewTableCell().setText("col three, row one");

  //create CTTblGrid for this table with widths of the 3 columns. 
  //necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
  //values are in unit twentieths of a point (1/1440 of an inch)
  //first column = 2 inches width
  table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
  //other columns (2 in this case) also each 2 inches width
  for (int col = 1 ; col < 3; col++) {
   table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
  }

  //create second row
  XWPFTableRow tableRowTwo = table.createRow();
  tableRowTwo.getCell(0).setText("col one, row two");
  tableRowTwo.getCell(1).setText("col two, row two");
  tableRowTwo.getCell(2).setText("col three, row two");

  //create third row
  XWPFTableRow tableRowThree = table.createRow();
  tableRowThree.getCell(0).setText("col one, row three");
  tableRowThree.getCell(1).setText("col two, row three");
  tableRowThree.getCell(2).setText("col three, row three");

  paragraph = document.createParagraph();

  //trying picture
  XWPFRun run = paragraph.createRun();
  run.setText("The picture in line: ");
  InputStream in = new FileInputStream("samplePict.jpeg");
  run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
  in.close();  
  run.setText(" text after the picture.");

  paragraph = document.createParagraph();

  //document must be written so underlying objects will be committed
  ByteArrayOutputStream out = new ByteArrayOutputStream();
  document.write(out);
  document.close();

  document = new XWPFDocument(new ByteArrayInputStream(out.toByteArray()));
  PdfOptions options = PdfOptions.create();
  PdfConverter converter = (PdfConverter)PdfConverter.getInstance();
  //close the output stream after converting so the PDF is flushed to disk
  try (OutputStream pdfOut = new FileOutputStream("XWPFToPDFConverterSampleMin.pdf")) {
   converter.convert(document, pdfOut, options);
  }

  document.close();

 }
}
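The unit comments in the example above (twips for the page size and grid columns, EMU for the picture) are easy to get wrong by a factor of 20 or 72. As a rough sketch, the arithmetic can be checked with a small helper that needs no apache poi classes. The class name TwipUnits and its methods are made up for illustration and are not part of any library:

```java
import java.math.BigInteger;

// Hypothetical helper (not part of apache poi): the unit arithmetic from the comments above.
final class TwipUnits {
    static final int TWIPS_PER_POINT = 20;  // 1 twip = 1/20 pt
    static final int POINTS_PER_INCH = 72;  // 1 inch = 72 pt, so 1440 twips
    static final int EMU_PER_POINT = 12700; // 914400 EMU per inch / 72

    // Inches to twips, as a BigInteger ready for setW/setH.
    static BigInteger inchesToTwips(double inches) {
        return BigInteger.valueOf(Math.round(inches * POINTS_PER_INCH * TWIPS_PER_POINT));
    }

    // Twips to points.
    static double twipsToPoints(long twips) {
        return twips / (double) TWIPS_PER_POINT;
    }

    // Points to EMU, the unit used for picture width and height.
    static long pointsToEmu(double points) {
        return Math.round(points * EMU_PER_POINT);
    }

    public static void main(String[] args) {
        System.out.println(inchesToTwips(8.5));   // 12240 (letter width)
        System.out.println(inchesToTwips(11));    // 15840 (letter height)
        System.out.println(inchesToTwips(2));     // 2880  (one 2 inch grid column)
        System.out.println(twipsToPoints(12240)); // 612.0
        System.out.println(pointsToEmu(100));     // 1270000
    }
}
```

With such a helper, pageSz.setW(TwipUnits.inchesToTwips(8.5)) would be equivalent to the literal 12240 used above.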






Using XDocReport


Another way would be using the newest version of opensagres/xdocreport as described in Converter only with ConverterRegistry:

import java.io.*;
import java.math.BigInteger;

//needed jars: xdocreport-2.0.1.jar, 
//             odfdom-java-0.8.7.jar,
//             itext-2.1.7.jar  
import fr.opensagres.xdocreport.converter.Options;
import fr.opensagres.xdocreport.converter.IConverter;
import fr.opensagres.xdocreport.converter.ConverterRegistry;
import fr.opensagres.xdocreport.converter.ConverterTypeTo;
import fr.opensagres.xdocreport.core.document.DocumentKind;

//needed jars: apache poi and its dependencies
//             and additionally: ooxml-schemas-1.3.jar 
import org.apache.poi.xwpf.usermodel.*;
import org.apache.poi.util.Units;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

public class XWPFToPDFXDocReport {

 public static void main(String[] args) throws Exception {

  XWPFDocument document = new XWPFDocument();

  // there must be a styles document, even if it is empty
  XWPFStyles styles = document.createStyles();

  // there must be section properties for the page having at least the page size set
  CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
  CTPageSz pageSz = sectPr.addNewPgSz();
  pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
  pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"

  // filling the body
  XWPFParagraph paragraph = document.createParagraph();

  //create table
  XWPFTable table = document.createTable();

  //create first row
  XWPFTableRow tableRowOne = table.getRow(0);
  tableRowOne.getCell(0).setText("col one, row one");
  tableRowOne.addNewTableCell().setText("col two, row one");
  tableRowOne.addNewTableCell().setText("col three, row one");

  //create CTTblGrid for this table with widths of the 3 columns. 
  //necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
  //values are in unit twentieths of a point (1/1440 of an inch)
  //first column = 2 inches width
  table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
  //other columns (2 in this case) also each 2 inches width
  for (int col = 1 ; col < 3; col++) {
   table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
  }

  //create second row
  XWPFTableRow tableRowTwo = table.createRow();
  tableRowTwo.getCell(0).setText("col one, row two");
  tableRowTwo.getCell(1).setText("col two, row two");
  tableRowTwo.getCell(2).setText("col three, row two");

  //create third row
  XWPFTableRow tableRowThree = table.createRow();
  tableRowThree.getCell(0).setText("col one, row three");
  tableRowThree.getCell(1).setText("col two, row three");
  tableRowThree.getCell(2).setText("col three, row three");

  paragraph = document.createParagraph();

  //trying picture
  XWPFRun run = paragraph.createRun();
  run.setText("The picture in line: ");
  InputStream in = new FileInputStream("samplePict.jpeg");
  run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
  in.close();  
  run.setText(" text after the picture.");

  paragraph = document.createParagraph();

  //document must be written so underlying objects will be committed
  ByteArrayOutputStream out = new ByteArrayOutputStream();
  document.write(out);
  document.close();

  // 1) Create options DOCX 2 PDF to select the right converter from the registry
  Options options = Options.getFrom(DocumentKind.DOCX).to(ConverterTypeTo.PDF);

  // 2) Get the converter from the registry
  IConverter converter = ConverterRegistry.getRegistry().getConverter(options);

  // 3) Convert DOCX 2 PDF
  InputStream docxin= new ByteArrayInputStream(out.toByteArray());
  OutputStream pdfout = new FileOutputStream(new File("XWPFToPDFXDocReport.pdf"));
  converter.convert(docxin, pdfout, options);

  docxin.close();       
  pdfout.close();       

 }
}






October 2018: This code works using apache poi 3.17. It cannot work using apache poi 4.0.0 due to changes in apache poi which have not yet been taken into account in fr.opensagres.poi.xwpf.converter or in fr.opensagres.xdocreport.converter.
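If a conversion still fails, the exception from the question is a wrapper (an XWPFConverterException around an IOException), and the innermost cause is usually the more informative part. A minimal, library-independent sketch for printing the whole cause chain follows. The class name CauseChain and its methods are hypothetical, and the RuntimeException in main only stands in for the converter's own exception type:

```java
import java.io.IOException;

// Hypothetical helper: walks Throwable.getCause() to expose what a wrapper exception hides.
final class CauseChain {
    private static final int MAX_DEPTH = 20; // guard against cyclic cause chains

    // Returns the innermost cause, or t itself if it has no cause.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        int depth = 0;
        while (current.getCause() != null && current.getCause() != current && depth < MAX_DEPTH) {
            current = current.getCause();
            depth++;
        }
        return current;
    }

    // Builds one line per exception in the chain, outermost first.
    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < MAX_DEPTH; cur = cur.getCause(), depth++) {
            if (depth > 0) {
                sb.append("\ncaused by: ");
            }
            sb.append(cur.getClass().getName()).append(": ").append(cur.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the wrapped exception shown in the question.
        Exception wrapped = new RuntimeException("conversion failed",
                new IOException("Unable to parse xml bean"));
        System.out.println(describe(wrapped));
        System.out.println("root: " + rootCause(wrapped).getMessage());
    }
}
```

Calling CauseChain.describe(e) in the catch block around converter.convert(...) would be one way to see whether the failure comes from XML parsing, a missing class, or something else.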
