Grouping following-siblings with same name and same attributes causes exception in Saxon


Problem description

I have some XML documents (similar to DocBook) that have to be transformed to XSL-FO. Some of the documents contain poems, and the lines of the poems are written in separate p tags. The verses are separated by br tags. There are also "page" tags that are irrelevant and should be ignored.

Typical code example:

<h4>Headline</h4>
<p>1st line of 1st verse</p>
<p>2nd line of 1st verse</p>
<br/>
<p>1st line of 2nd verse</p>
<p>2nd line of 2nd verse</p>
<page n="100"/>
<p>3rd line of 2nd verse</p>
<h4>Other headline</h4>

For the XSL-FO output, I would like to gather all the text of a verse into one single fo:block. Right now the mechanism works for code structures as above, but there are some exceptions. The current approach is to decide for every p tag:

- Am I the first line of a verse?
- If yes: collect all the text of this verse and write it into a fo:block, using the attributes of the current (first) p tag to set the formatting of the block.
- If no: the contents were already handled earlier, so do nothing.

A first line is a p tag that is immediately preceded by an h4 or a br tag (or by a page tag which itself is immediately preceded by a br tag). That one was easy to develop.
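The first-line rule can be sketched as a match pattern. This is only an illustration, not the asker's actual template: the `first-line` wrapper element is hypothetical, and the predicate is slightly broader than the rule as stated, because it skips any number of page tags and also treats a page tag after an h4 as transparent.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- First line of a verse: the nearest preceding sibling element that
         is not a page marker is an h4 or a br. -->
    <xsl:template match="p[preceding-sibling::*[not(self::page)][1][self::h4 or self::br]]">
        <!-- 'first-line' is a hypothetical wrapper used only for this sketch. -->
        <first-line>
            <xsl:value-of select="."/>
        </first-line>
    </xsl:template>

    <!-- Every other p is a continuation line and produces nothing here. -->
    <xsl:template match="p"/>

</xsl:stylesheet>
```

The template with the predicate has a higher default priority than the plain `p` template, so no explicit priority attribute is needed.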

Collecting the text of a verse was easy for the given example: group all following siblings so that each group ends with an h4 or br tag, then take the first group and use all of its p tags (ignoring any page tags in between and the closing h4 or br tag).

In code:

<xsl:for-each-group select="following-sibling::*" group-ending-with="br|h4">
    <xsl:if test="position()=1">
        <xsl:for-each select="current-group()[not(self::h4) and not(self::br) and not(self::page)]">
            <xsl:apply-templates/>&crt;
        </xsl:for-each>
    </xsl:if>
</xsl:for-each-group>

Now to a similar code example:

<h4>Headline</h4>
<p class="center">1</p>
<p>1st line of 1st verse</p>
<p>2nd line of 1st verse</p>
<br/>
<p class="center">2</p>
<p>1st line of 2nd verse</p>
<p>2nd line of 2nd verse</p>
<page n="100"/>
<p>3rd line of 2nd verse</p>
<h4>Other headline</h4>

Now the centered p tags act like subheadlines for the verses that follow them. A subheadline is not really a verse, but for my purposes it would be enough if it were kept separate from the real verse's text. Thus the slightly varied rule for getting all the text of the current verse is: group all following siblings so that each group ends with an h4 or br tag, or with a p tag that has a different class than the current p tag; then take the first group and use all of its p tags (ignoring any page tags in between and the closing h4 or br tag).

Therefore I stored the value of the class attribute of the current p tag in a variable called attributes and defined the grouping rule as:

<xsl:for-each-group select="following-sibling::*" group-ending-with="br|h4|p[normalize-space(@class) != $attributes]">

In turn, when determining whether a p tag is the first line of a verse, it can be preceded not only by an h4 or br, but also by another p tag that has a different class attribute value.

This works fine in my testing environment in Oxygen using Saxon-B 9.1.0.6. But the transformation has to be performed in Java using Saxon9.jar, and there the use of a variable inside the group-ending-with attribute of xsl:for-each-group causes an exception.

And now I am kind of stuck.

Could the grouping conditions be defined in a better way? Or should this maybe not be done with grouping at all, but with a totally different approach?

The source files are as they are, the tagging might not be optimal, but it is as it is. The transformation is not new but was subsequently adapted to our needs. Source code with poems in it was simply avoided earlier, but I'd like to find a solution for this.

Any help would be greatly appreciated.

Best regards,

Christian Kirchhoff

Solution

This stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="div[@class='poem']">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
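            <xsl:for-each-group select="*" group-ending-with="br|h4">
                <!-- Skip groups without verse lines, such as the one holding only the h4 -->
                <xsl:if test="current-group()/self::p[not(@class)]">
                    <div class="strophe">
                        <xsl:copy-of select="current-group()/self::p[not(@class)]"/>
                    </div>
                </xsl:if>
            </xsl:for-each-group>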
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

With this input:

<div class="poem">
    <h4>Headline</h4>
    <p>1st line of 1st verse</p>
    <p>2nd line of 1st verse</p>
    <br/>
    <p>1st line of 2nd verse</p>
    <p>2nd line of 2nd verse</p>
    <page n="100"/>
    <p>3rd line of 2nd verse</p>
</div>

Output:

<div class="poem">
    <div class="strophe">
        <p>1st line of 1st verse</p>
        <p>2nd line of 1st verse</p>
    </div>
    <div class="strophe">
        <p>1st line of 2nd verse</p>
        <p>2nd line of 2nd verse</p>
        <p>3rd line of 2nd verse</p>
    </div>
</div>

With this input:

<div class="poem">
    <h4>Headline</h4>
    <p class="center">1</p>
    <p>1st line of 1st verse</p>
    <p>2nd line of 1st verse</p>
    <br/>
    <p class="center">2</p>
    <p>1st line of 2nd verse</p>
    <p>2nd line of 2nd verse</p>
    <page n="100"/>
    <p>3rd line of 2nd verse</p>
</div>

Output:

<div class="poem">
    <div class="strophe">
        <p>1st line of 1st verse</p>
        <p>2nd line of 1st verse</p>
    </div>
    <div class="strophe">
        <p>1st line of 2nd verse</p>
        <p>2nd line of 2nd verse</p>
        <p>3rd line of 2nd verse</p>
    </div>
</div>

So, this stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="div[@class='poems']">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:for-each-group select="*[preceding-sibling::h4]"
                                group-starting-with="h4">
                <div class="poem">
                    <xsl:for-each-group select="current-group()"
                                        group-ending-with="br">
                        <div class="strophe">
                            <xsl:copy-of select="current-group()
                                                  /self::p[not(@class)]"/>
                        </div>
                    </xsl:for-each-group>
                </div>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

With this input:

<div class="poems">
    <h3>Poems</h3>
    <h4>Headline</h4>
    <p>1st line of 1st verse</p>
    <p>2nd line of 1st verse</p>
    <br/>
    <p>1st line of 2nd verse</p>
    <p>2nd line of 2nd verse</p>
    <page n="100"/>
    <p>3rd line of 2nd verse</p>
    <h4>Headline</h4>
    <p class="center">1</p>
    <p>1st line of 1st verse</p>
    <p>2nd line of 1st verse</p>
    <br/>
    <p class="center">2</p>
    <p>1st line of 2nd verse</p>
    <p>2nd line of 2nd verse</p>
    <page n="100"/>
    <p>3rd line of 2nd verse</p>
</div>

Output:

<div class="poems">
    <div class="poem">
        <div class="strophe">
            <p>1st line of 1st verse</p>
            <p>2nd line of 1st verse</p>
        </div>
        <div class="strophe">
            <p>1st line of 2nd verse</p>
            <p>2nd line of 2nd verse</p>
            <p>3rd line of 2nd verse</p>
        </div>
    </div>
    <div class="poem">
        <div class="strophe">
            <p>1st line of 1st verse</p>
            <p>2nd line of 1st verse</p>
        </div>
        <div class="strophe">
            <p>1st line of 2nd verse</p>
            <p>2nd line of 2nd verse</p>
            <p>3rd line of 2nd verse</p>
        </div>
    </div>
</div>
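
If the centered subheadlines should be kept as their own blocks rather than dropped, one possible variation is to group with group-adjacent instead of group-ending-with. group-adjacent takes an XPath expression rather than a pattern, so the class comparison needs no variable inside a pattern. The following is a sketch under assumptions, not a tested replacement for the stylesheets above: it assumes the `div[@class='poem']` wrapper from the first example, emits fo:block directly, puts each line in a nested fo:block (the question used a line-break entity instead), and the `text-align` handling for the `center` class is hypothetical.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format">

    <xsl:template match="div[@class='poem']">
        <!-- Dropping page markers from the population makes the lines on
             either side of a page break adjacent to each other. -->
        <xsl:for-each-group select="*[not(self::page)]"
                            group-adjacent="if (self::p)
                                            then concat('p:', normalize-space(@class))
                                            else name()">
            <!-- Only runs of p elements become blocks; h4 and br act
                 purely as separators in this sketch. -->
            <xsl:if test="self::p">
                <fo:block>
                    <!-- Hypothetical formatting: center runs whose class is 'center'. -->
                    <xsl:if test="normalize-space(@class) = 'center'">
                        <xsl:attribute name="text-align">center</xsl:attribute>
                    </xsl:if>
                    <!-- One nested block per line of the run. -->
                    <xsl:for-each select="current-group()">
                        <fo:block>
                            <xsl:apply-templates/>
                        </fo:block>
                    </xsl:for-each>
                </fo:block>
            </xsl:if>
        </xsl:for-each-group>
    </xsl:template>

</xsl:stylesheet>
```

With the second sample input, this should yield one block for each subheadline and one block for each verse, because a run of p elements ends wherever the class value changes or a br or h4 intervenes. Headlines are not output by this sketch and would need their own handling.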
