PDF report with embedded HTML


Problem description

We have a Java-based system that reads data from a database, merges individual data fields with preset XSL-FO tags and converts the result to PDF with Apache FOP.

In XSL-FO format it looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE Html [
<!ENTITY nbsp  "&#160;"> 
    <!-- all other entities -->
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
    <xsl:output method="xml" indent="yes" />
    <xsl:template match="/">

        <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:svg="http://www.w3.org/2000/svg" font-family="..." font-size="...">
            <fo:layout-master-set>          
                <fo:simple-page-master master-name="Letter Page" page-width="8.500in" page-height="11.000in">

                    <!-- appropriate settings -->

                </fo:simple-page-master>
            </fo:layout-master-set>
            <fo:page-sequence master-reference="Letter Page">

                <!-- some static content -->

                <fo:flow flow-name="xsl-region-body">
                    <fo:block>
                        <fo:table ...>
                            <fo:table-column ... />
                            <fo:table-body>
                                <fo:table-row>
                                    <fo:table-cell ...>
                                        <fo:block text-align="...">
                                            <fo:inline font-size="..." font-weight="...">
                                                <!-- Header / Title -->
                                            </fo:inline>
                                        </fo:block>
                                    </fo:table-cell>
                                </fo:table-row>
                            </fo:table-body>
                        </fo:table>
                    </fo:block>

                    <fo:block>

                        <fo:table ...>
                            <fo:table-column ... />
                            <fo:table-body> 
                                <fo:table-row>
                                    <fo:table-cell>
                                        <fo:block ...>
                                            <!-- Field A -->                                
                                        </fo:block>
                                    </fo:table-cell>
                                </fo:table-row>
                            </fo:table-body>
                        </fo:table>

                        <!-- Other fields in a very similar fashion as the above "Field A" -->

                    </fo:block>

                </fo:flow>      

            </fo:page-sequence>

        </fo:root>              

    </xsl:template>

</xsl:stylesheet>

Now I am looking for a way to allow some of the fields to contain static HTML-formatted content. This content will be generated by our HTML-enabled editor (something along the lines of CLEditor, CKEditor, etc.) or pasted from outside.

My plan is to follow the recipe from this JavaWorld article:

  • use JTidy to convert HTML-formatted string to proper XHTML
  • further modify xhtml2fo.xsl from Antenna House to remove all document-wide and page-wide transformations
  • apply this modified XSLT to my XHTML string (javax.xml.transform)
  • extract all the nodes under the root with XPath (javax.xml.xpath)
  • feed the result directly into existing XSL-FO document
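
Steps 3 and 4 of the plan (apply the XSLT, then pull out the nodes under the root) might look roughly like the sketch below. It is not the code from the article. The class name `XhtmlFragmentToFo`, the method `convert`, the throwaway `wrapper` root element and the tiny inline stylesheet (a hypothetical stand-in for the trimmed-down xhtml2fo.xsl) are all assumptions made for illustration. The input is assumed to be well-formed XHTML without a namespace, so the JTidy step is left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;

public class XhtmlFragmentToFo {

    // Hypothetical stand-in for the modified xhtml2fo.xsl: it emits a
    // throwaway <wrapper> root and maps only <p> and <b>.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='/'><wrapper><xsl:apply-templates/></wrapper></xsl:template>"
      + "<xsl:template match='p'><fo:block><xsl:apply-templates/></fo:block></xsl:template>"
      + "<xsl:template match='b'><fo:inline font-weight='bold'>"
      + "<xsl:apply-templates/></fo:inline></xsl:template>"
      + "</xsl:stylesheet>";

    public static String convert(String xhtml) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();

        // Apply the stylesheet to the XHTML string and collect a DOM tree.
        DOMResult dom = new DOMResult();
        factory.newTransformer(new StreamSource(new StringReader(XSLT)))
               .transform(new StreamSource(new StringReader(xhtml)), dom);

        // Select every node directly under the root element.
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("/*/node()", dom.getNode(), XPathConstants.NODESET);

        // Serialize each node without an XML declaration so the fragments
        // can be spliced into the existing XSL-FO document.
        Transformer serializer = factory.newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        for (int i = 0; i < nodes.getLength(); i++) {
            serializer.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
        }
        return out.toString();
    }
}
```

Calling `XhtmlFragmentToFo.convert("<div><p>Hello <b>world</b></p></div>")` returns the serialized `fo:block` fragment without the wrapper element. Whether that fragment is valid at the insertion point still depends on what the surrounding XSL-FO allows there.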

I have a bare-bones version of this code, and it produces the following error:

(Location of error unknown)org.apache.fop1.fo.ValidationException: "{http://www.w3.org/1999/XSL/Format}table-body" is not a valid child of "fo:block"! (No context info available)

My questions:

  1. What would be the way to troubleshoot this issue?
  2. Can <fo:block> serve as a generic container with other objects (including tables) nested inside?
  3. Is this an overall reasonable approach to solving the task?

If someone has already "been there, done that", please share your experience.

Solution

The best way to troubleshoot is to open the XSL FO in a validating viewer/editor. Many of them (oXygen, for example) flag errors in the XSL FO structure as soon as you open the file and describe the issue, much like the error you got.

In your case, you evidently have an fo:table-body as a child of an fo:block, which is not allowed. An fo:table-body has only one valid parent, fo:table. Either you are missing the fo:table tag, or you have erroneously inserted an fo:block at this position.
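
If no validating editor is at hand, the same parent/child rule can be checked with a few lines of Java. This is only a sketch under assumptions: the class `FoTableBodyCheck` and its method `misplacedParents` are hypothetical names, and the check covers just this one rule (fo:table-body must sit directly under fo:table), not XSL-FO validity in general.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FoTableBodyCheck {

    private static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    // Matches every fo:table-body whose parent is not an fo:table.
    // local-name()/namespace-uri() avoid the need for a NamespaceContext.
    private static final String MISPLACED =
        "//*[local-name()='table-body' and namespace-uri()='" + FO_NS + "']"
      + "[not(parent::*[local-name()='table' and namespace-uri()='" + FO_NS + "'])]";

    /** Returns the node names of the parents of misplaced fo:table-body elements. */
    public static List<String> misplacedParents(String foXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for namespace-uri() to see the FO namespace
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(foXml)));

        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(MISPLACED, doc, XPathConstants.NODESET);

        List<String> parents = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            parents.add(hits.item(i).getParentNode().getNodeName());
        }
        return parents;
    }
}
```

Running this on the generated FO (the output of the XSLT step, before it goes to FOP) gives an empty list when every fo:table-body is correctly placed, and otherwise names the offending parents.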

In my opinion, I would do things slightly differently. I would put the XHTML content inline into the XSL FO, right where you want it. Then I would create an identity transform that copies all the fo-based content unchanged but converts the XHTML parts using XSL. This way, you can step through that transform in an XSL editor like oXygen and see where errors occur and exactly why, just as in any other debugger.
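
A minimal sketch of such an identity transform, run from Java, could look like this. The class name `InlineXhtmlTransform`, the method `transform` and the two XHTML mappings (p and b/strong) are assumptions chosen for illustration, and the XHTML is assumed to be in the http://www.w3.org/1999/xhtml namespace. A real stylesheet would need templates for every HTML element you expect.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class InlineXhtmlTransform {

    private static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'"
      + " xmlns:h='http://www.w3.org/1999/xhtml'"
      + " exclude-result-prefixes='h'>"
        // Identity template: copies everything that has no more specific
        // template, which includes all fo:* elements and their attributes.
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
        // XHTML parts are converted instead of copied.
      + "<xsl:template match='h:p'>"
      + "<fo:block><xsl:apply-templates/></fo:block>"
      + "</xsl:template>"
      + "<xsl:template match='h:b|h:strong'>"
      + "<fo:inline font-weight='bold'><xsl:apply-templates/></fo:inline>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Takes XSL FO with XHTML embedded inline and returns pure XSL FO. */
    public static String transform(String foWithXhtml) throws Exception {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)))
                .transform(new StreamSource(new StringReader(foWithXhtml)),
                           new StreamResult(out));
        return out.toString();
    }
}
```

XHTML elements without a template of their own fall through to the identity template and are copied as-is, so they would still need to be handled before the result is handed to FOP.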

Note: You may also wish to look at other XSLs, especially if your HTML may contain style="" CSS attributes. In that case it is no longer simple HTML, and you will need a better method for converting HTML with CSS to FO.

http://www.cloudformatter.com/css2pdf is based on this complete transform. That general stylesheet is available here: http://xep.cloudformatter.com/doc/XSL/xeponline-fo-translate-2.xsl

I am the author of that stylesheet. It does much more than you ask, but has a fairly complex parsing recursion for converting CSS styling into XSL FO attributes.
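
To give a rough idea of what converting CSS styling into XSL FO attributes involves, here is a deliberately naive Java sketch. It is not how the stylesheet above works. The class `StyleToFoAttributes`, the method `convert` and the short property list are assumptions for illustration. It relies only on the fact that XSL FO defines a number of properties under the same names as CSS, and it passes those through while dropping everything else.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class StyleToFoAttributes {

    // Illustrative subset of properties that CSS and XSL FO both define
    // under the same name; a real converter needs a far longer list.
    private static final Set<String> SHARED = new HashSet<>(Arrays.asList(
        "color", "background-color", "font-family", "font-size",
        "font-style", "font-weight", "text-align", "text-indent", "line-height"));

    /** Turns the value of an HTML style attribute into FO attribute name/value pairs. */
    public static Map<String, String> convert(String style) {
        Map<String, String> attrs = new LinkedHashMap<>();
        // Naive split: breaks on values that themselves contain ';' or ':'
        // in unexpected places, such as data URLs.
        for (String declaration : style.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon < 0) {
                continue; // not a "name: value" pair
            }
            String name = declaration.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = declaration.substring(colon + 1).trim();
            if (SHARED.contains(name) && !value.isEmpty()) {
                attrs.put(name, value);
            }
        }
        return attrs;
    }
}
```

For example, `convert("color: red; FONT-WEIGHT: bold; float: left;")` keeps `color` and `font-weight` and drops `float`. Shorthand properties, unit differences and values that FO spells differently are not handled here, which is where the real complexity lies.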
