PDF report with embedded HTML

Question

We have a Java-based system that reads data from a database, merges individual data fields with preset XSL-FO tags and converts the result to PDF with Apache FOP.

In XSL-FO format it looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE Html [
<!ENTITY nbsp  "&#160;"> 
    <!-- all other entities -->
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <xsl:output method="xml" indent="yes" />
    <xsl:template match="/">

        <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:svg="http://www.w3.org/2000/svg" font-family="..." font-size="...">
            <fo:layout-master-set>          
                <fo:simple-page-master master-name="Letter Page" page-width="8.500in" page-height="11.000in">

                    <!-- appropriate settings -->

                </fo:simple-page-master>
            </fo:layout-master-set>
            <fo:page-sequence master-reference="Letter Page">

                <!-- some static content -->

                <fo:flow flow-name="xsl-region-body">
                    <fo:block>
                        <fo:table ...>
                            <fo:table-column ... />
                            <fo:table-body>
                                <fo:table-row>
                                    <fo:table-cell ...>
                                        <fo:block text-align="...">
                                            <fo:inline font-size="..." font-weight="...">
                                                <!-- Header / Title -->
                                            </fo:inline>
                                        </fo:block>
                                    </fo:table-cell>
                                </fo:table-row>
                            </fo:table-body>
                        </fo:table>
                    </fo:block>

                    <fo:block>

                        <fo:table ...>
                            <fo:table-column ... />
                            <fo:table-body> 
                                <fo:table-row>
                                    <fo:table-cell>
                                        <fo:block ...>
                                            <!-- Field A -->                                
                                        </fo:block>
                                    </fo:table-cell>
                                </fo:table-row>
                            </fo:table-body>
                        </fo:table>

                        <!-- Other fields in a very similar fashion as the above "Field A" -->

                    </fo:block>

                </fo:flow>      

            </fo:page-sequence>

        </fo:root>              

    </xsl:template>

</xsl:stylesheet>

Now I am looking for a way to allow some of the fields to contain static HTML-formatted content. This content will be generated by our HTML-enabled editor (something along the lines of CLEditor, CKEditor, etc.) or pasted from outside.

My plan is to follow the recipe from this JavaWorld article:

  • use JTidy to convert HTML-formatted string to proper XHTML
  • further modify xhtml2fo.xsl from Antenna House to remove all document-wide and page-wide transformations
  • apply this modified XSLT to my XHTML string (javax.xml.transform)
  • extract all the nodes under the root with XPath (javax.xml.xpath)
  • feed the result directly into existing XSL-FO document
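
Steps 3 to 5 of this plan could be sketched roughly as follows. This is only an illustration under several assumptions: the input is already well-formed XHTML (the JTidy step is skipped), the inline stylesheet is a hypothetical, heavily cut-down stand-in for a modified xhtml2fo.xsl, and the names XhtmlFragmentToFo, appendAsFo and demo are invented for the example. It walks the DOM children directly instead of using XPath, and it wraps the fragment in a single block because a DOM result needs exactly one root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class XhtmlFragmentToFo {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    // Hypothetical stand-in for a stripped-down xhtml2fo.xsl:
    // no fo:root and no page masters, only element-level mappings.
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0'",
        "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'",
        "    xmlns:fo='http://www.w3.org/1999/XSL/Format'",
        "    xmlns:h='http://www.w3.org/1999/xhtml'",
        "    exclude-result-prefixes='h'>",
        "  <xsl:template match='h:div'>",
        "    <fo:block><xsl:apply-templates/></fo:block>",
        "  </xsl:template>",
        "  <xsl:template match='h:p'>",
        "    <fo:block space-after='6pt'><xsl:apply-templates/></fo:block>",
        "  </xsl:template>",
        "  <xsl:template match='h:b | h:strong'>",
        "    <fo:inline font-weight='bold'><xsl:apply-templates/></fo:inline>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    /** Transforms an XHTML fragment and appends the resulting FO nodes to target. */
    static void appendAsFo(Element target, String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            DOMResult result = new DOMResult();
            t.transform(new StreamSource(new StringReader(xhtml)), result);
            Element wrapper = ((Document) result.getNode()).getDocumentElement();
            // Copy everything under the wrapper block into the existing FO tree.
            for (Node n = wrapper.getFirstChild(); n != null; n = n.getNextSibling()) {
                target.appendChild(target.getOwnerDocument().importNode(n, true));
            }
        } catch (Exception e) {
            throw new IllegalStateException("XHTML to FO conversion failed", e);
        }
    }

    /** Builds a stand-in for the "Field A" block and fills it from an XHTML string. */
    static Element demo() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document fo = dbf.newDocumentBuilder().newDocument();
            Element fieldA = fo.createElementNS(FO_NS, "fo:block");
            fo.appendChild(fieldA);
            appendAsFo(fieldA, "<div xmlns='http://www.w3.org/1999/xhtml'>"
                    + "<p>Hello <b>world</b></p></div>");
            return fieldA;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element fieldA = demo();
        System.out.println(fieldA.getFirstChild().getNodeName());
    }
}
```

Elements that the stylesheet does not map (tables, lists and so on) fall through to the XSLT built-in rules here, so only their text survives; a real stylesheet would need a template for each of them.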

I have a bare-bones version of such code and got the following error:

(Location of error unknown)org.apache.fop1.fo.ValidationException: "{http://www.w3.org/1999/XSL/Format}table-body" is not a valid child of "fo:block"! (No context info available)

My questions:

  1. What would be the way to troubleshoot this issue?
  2. Can <fo:block> serve as a generic container with other objects (including tables) nested inside?
  3. Is this an overall reasonable approach to solving the task?

If someone has already "been there, done that", please share your experience.

Recommended answer

The best way to troubleshoot is to use a validating viewer/editor to examine the XSL FO. Many (such as oXygen) will flag errors in the XSL FO structure as soon as you open the file, and they describe the issue much as the reported error does.

In your case, you evidently have an fo:table-body as a child of fo:block, which is not allowed. An fo:table-body has only one valid parent, fo:table. Either you are missing the fo:table tag or you have erroneously inserted an fo:block at this position.
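
If no validating editor is at hand, this particular mistake can also be located with a small XPath check over the generated FO before it reaches FOP. The sketch below is a narrow check for this one rule, not a validator; the class and method names are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TableBodyCheck {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    /** Returns the parent node name of every fo:table-body whose parent is not fo:table. */
    static List<String> misplacedTableBodies(String foXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(foXml)));
            XPath xp = XPathFactory.newInstance().newXPath();
            // local-name()/namespace-uri() avoid having to register a NamespaceContext.
            String expr =
                "//*[local-name()='table-body' and namespace-uri()='" + FO_NS + "']"
              + "[not(parent::*[local-name()='table' and namespace-uri()='" + FO_NS + "'])]";
            NodeList hits = (NodeList) xp.evaluate(expr, doc, XPathConstants.NODESET);
            List<String> parents = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                parents.add(hits.item(i).getParentNode().getNodeName());
            }
            return parents;
        } catch (Exception e) {
            throw new IllegalStateException("could not check FO document", e);
        }
    }

    public static void main(String[] args) {
        String bad = "<fo:block xmlns:fo='" + FO_NS + "'><fo:table-body/></fo:block>";
        String good = "<fo:block xmlns:fo='" + FO_NS + "'>"
                + "<fo:table><fo:table-body/></fo:table></fo:block>";
        System.out.println(misplacedTableBodies(bad));   // prints [fo:block]
        System.out.println(misplacedTableBodies(good));  // prints []
    }
}
```

The second sample also shows the nesting that is allowed: an fo:table inside an fo:block, as the stylesheet in the question already does.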

In my opinion, I might do things slightly differently. I would put the XHTML content inline into the XSL FO right where you want it. Then I would create an identity transform that copies over all the content that is fo-based but converts the XHTML parts using XSL. This way, you can actually step through that transform in an XSL editor like oXygen and see where errors occur and exactly why, just as in any other debugger.
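
A minimal sketch of that identity-transform idea, driven from Java with javax.xml.transform, might look like this. The handful of XHTML mappings is illustrative only and is not taken from any published stylesheet; InlineXhtmlToFo, convert and SAMPLE are names invented for the example, and the sample assumes the embedded fragment declares the XHTML namespace.

```java
import java.io.StringReader;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

class InlineXhtmlToFo {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0'",
        "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'",
        "    xmlns:fo='http://www.w3.org/1999/XSL/Format'",
        "    xmlns:h='http://www.w3.org/1999/xhtml'",
        "    exclude-result-prefixes='h'>",
        "  <!-- identity template: copies all fo-based content unchanged -->",
        "  <xsl:template match='@*|node()'>",
        "    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>",
        "  </xsl:template>",
        "  <!-- XHTML parts are converted instead of copied -->",
        "  <xsl:template match='h:div | h:p'>",
        "    <fo:block><xsl:apply-templates/></fo:block>",
        "  </xsl:template>",
        "  <xsl:template match='h:b | h:strong'>",
        "    <fo:inline font-weight='bold'><xsl:apply-templates/></fo:inline>",
        "  </xsl:template>",
        "  <!-- any other XHTML element: drop the tag, keep its content -->",
        "  <xsl:template match='h:*'>",
        "    <xsl:apply-templates/>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // An FO block with an XHTML fragment embedded where the field value goes.
    static final String SAMPLE =
        "<fo:block xmlns:fo='" + FO_NS + "' font-size='10pt'>"
      + "<div xmlns='" + XHTML_NS + "'>"
      + "<p>Hello <b>world</b>, <span>plain</span></p></div></fo:block>";

    /** Runs the identity-plus-conversion stylesheet over a mixed FO/XHTML document. */
    static Document convert(String foWithXhtml) {
        try {
            DOMResult result = new DOMResult();
            TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)))
                    .transform(new StreamSource(new StringReader(foWithXhtml)), result);
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        Document out = convert(SAMPLE);
        System.out.println("XHTML elements left: "
                + out.getElementsByTagNameNS(XHTML_NS, "*").getLength());
        System.out.println("fo:inline elements: "
                + out.getElementsByTagNameNS(FO_NS, "inline").getLength());
    }
}
```

The catch-all h:* template has a higher default priority than the identity template and a lower one than the named element templates, so unmapped XHTML tags are stripped rather than copied into the FO.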

Note: You may also wish to look at other XSLs, especially if your HTML may have any style="" CSS attributes. In that case it is not simple HTML, and you will need a better method for converting HTML with CSS to FO.

http://www.cloudformatter.com/css2pdf is based on this complete transform. That general stylesheet is available here: http://xep.cloudformatter.com/doc/XSL/xeponline-fo-translate-2.xsl

I am the author of that stylesheet. It does much more than you ask, but has a fairly complex parsing recursion for converting CSS styling into XSL FO attributes.
