Using for-each-group for high performance XSLT

Question

I have an XSLT 1.0 stylesheet. It works with no problem. I want to upgrade it to 2.0 and use xsl:for-each-group (and make it perform well). Is that possible? How? Please explain.

I have many places like this:

    <xsl:if test="test condition">
      <xsl:for-each select="wo:tent">
        <width aidwidth='{/wo:document/styles[@wo:name = current()/@wo:style-name]/@wo:width}'/>
      </xsl:for-each>
    </xsl:if>

Added:

<xsl:template match="wo:country">
            <xsl:for-each select="@*">
                <xsl:copy/>
            </xsl:for-each>
            <xsl:variable name="states" select="wo:pages[@xil:style = &quot;topstates&quot; or @xil:style = &quot;toppage-title&quot;]"/>
            <xsl:variable name="provinces" select="wo:pages[@xil:style = &quot;topprovinces&quot;]"/>
            <xsl:choose>
                <xsl:when test="$states">
                    <xsl:apply-templates select="$states[2]/preceding-sibling::*"/>
                    <xsl:apply-templates select="$states[2]" mode="states">
                        <xsl:with-param name="states" select="$states[position() != 2]"/>
                    </xsl:apply-templates>
                </xsl:when>
                <xsl:when test="$provinces">
                    <xsl:apply-templates select="$provinces[2]/preceding-sibling::*"/>
                    <xsl:apply-templates select="$provinces[2]" mode="provinces">
                        <xsl:with-param name="provinces" select="$provinces[position() != 2]"/>
                    </xsl:apply-templates>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:apply-templates/>
                </xsl:otherwise>
            </xsl:choose>
    </xsl:template>

Source:

<?xml version="1.0" encoding="UTF-8"?>
<wo:country>
   some stuff
</wo:country>

Recommended answer

I have assumed that you want an in-depth description of xsl:for-each-group and how to use it. If this is not what you are asking for, then please let me know.

The instruction, new in XSLT 2.0, takes a sequence of items and groups them. The sequence of items is called "the population", and the groups are just called groups. The instruction processes each group in turn.

Possible attributes of the xsl:for-each-group instruction include:

  1. select
  2. group-by
  3. group-adjacent
  4. group-starting-with
  5. group-ending-with
  6. collation

@select is mandatory, and exactly one of group-by, group-adjacent, group-starting-with and group-ending-with must also be present; collation is optional. The instruction can take any number of xsl:sort children (but they must come first), followed by a sequence constructor. A "sequence constructor" is the term for the sequence-emitting instructions that go inside templates and the like.

The select attribute specifies an XPath expression which evaluates to the population to be grouped.

The group-by attribute specifies an XPath expression, which you use when the type of grouping is by common value. Every item in the population whose group-by value equals that of another item is placed in the same group as that item.
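As a minimal sketch of grouping by common value, assume a hypothetical input of city elements carrying name and country attributes (these names are illustrative and do not come from the question). The stylesheet groups the cities by country, and also shows an xsl:sort child placed ahead of the sequence constructor.

```xml
<!-- Hypothetical input (element and attribute names are assumptions):
     <cities>
       <city name="Milano" country="Italia"/>
       <city name="Paris"  country="France"/>
       <city name="Lyon"   country="France"/>
     </cities> -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <xsl:template match="cities">
    <countries>
      <!-- Population: all city children. Grouping key: the country attribute. -->
      <xsl:for-each-group select="city" group-by="@country">
        <!-- xsl:sort children come first; this one orders the groups by key. -->
        <xsl:sort select="current-grouping-key()"/>
        <country name="{current-grouping-key()}">
          <!-- current-group() holds every city that shares this key. -->
          <xsl:for-each select="current-group()">
            <city><xsl:value-of select="@name"/></city>
          </xsl:for-each>
        </country>
      </xsl:for-each-group>
    </countries>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input above, the France group (Paris, Lyon) would be emitted before the Italia group (Milano), because the groups are sorted by their key.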

XSLT 1.0 Muenchian grouping is not too difficult when the type of grouping is group by common value. There are two more common forms of grouping: grouping adjacent items by a shared value; and grouping a run of adjacent items that is demarcated, either at its end or at its beginning, by some test. While both these forms of grouping are still possible with Muenchian, it becomes relatively complex. Muenchian on these types will also be less efficient at scale, because of the use of the sibling axes.

Another advantage of XSLT 2.0 that comes to mind is that Muenchian only works on node sets, whereas xsl:for-each-group is broader in application because it works on a sequence of items, not just nodes.

The result of the @group-by expression will be a sequence of items. This sequence is atomized and de-duplicated. The population item being tested will be a member of one group per value. A strange consequence is that, with @group-by, an item may be a member of more than one group, or perhaps even of none. Although I suspect that anything you can do in XSLT 2.0 you can, by some tortuous path, do in XSLT 1.0, the ability to put an item into two groups is something that would be quite fiddly to do in XSLT 1.0 Muenchian.
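To illustrate the multiple-membership point, here is a small sketch over a hypothetical library document (the book, author and title names are assumptions made for this example). Because the group-by expression returns one value per author child, a book with two authors lands in two groups, and a book with no author lands in none.

```xml
<!-- Hypothetical input (names are assumptions):
     <library>
       <book title="A"><author>Smith</author><author>Jones</author></book>
       <book title="B"><author>Smith</author></book>
       <book title="C"/>
     </library> -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <xsl:template match="library">
    <authors>
      <!-- group-by="author" yields zero, one or many keys per book. -->
      <xsl:for-each-group select="book" group-by="author">
        <author name="{current-grouping-key()}">
          <xsl:for-each select="current-group()">
            <book><xsl:value-of select="@title"/></book>
          </xsl:for-each>
        </author>
      </xsl:for-each-group>
    </authors>
  </xsl:template>

</xsl:stylesheet>
```

With this input, book A appears under both Smith and Jones, book B appears under Smith only, and book C is not output at all because its grouping key sequence is empty.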

The attributes group-by, group-adjacent, group-starting-with and group-ending-with are mutually exclusive because they specify different kinds of grouping. With group-adjacent, items that have a common value and are adjacent in the population are grouped together. Unlike @group-by, @group-adjacent must evaluate, after atomization, to a single atomic value.
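A common use of group-adjacent is wrapping runs of consecutive sibling elements. The sketch below assumes a hypothetical doc element whose children are a mix of p and li elements (names chosen for illustration only). The key is a single boolean per item, which satisfies the single-atomic-value requirement.

```xml
<!-- Hypothetical input (names are assumptions):
     <doc>
       <p>Intro</p>
       <li>one</li>
       <li>two</li>
       <p>Middle</p>
       <li>three</li>
     </doc> -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <xsl:template match="doc">
    <doc>
      <!-- Adjacent items with the same key (true for li, false otherwise)
           fall into the same group. -->
      <xsl:for-each-group select="*" group-adjacent="boolean(self::li)">
        <xsl:choose>
          <xsl:when test="current-grouping-key()">
            <!-- A run of consecutive li elements: wrap it in a ul. -->
            <ul><xsl:copy-of select="current-group()"/></ul>
          </xsl:when>
          <xsl:otherwise>
            <!-- Any other run is copied through unchanged. -->
            <xsl:copy-of select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </doc>
  </xsl:template>

</xsl:stylesheet>
```

With this input, "one" and "two" share a ul, while "three" gets a ul of its own because the intervening p breaks the adjacency.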

Unlike select, group-adjacent and group-by, the group-starting-with attribute does not specify an XPath select expression, but rather a pattern, in the same way that xsl:template/@match specifies a pattern, not a selection. If an item in the population passes the pattern test, or is the first item in the population, then it starts a new group. Otherwise the item continues the group from the previous item.

Martin mentioned the spec examples (w3.org/TR/xslt20/#grouping-example). From that reference, I am going to copy the example entitled "Identifying a Group by its Initial Element", but alter it slightly to emphasize the point about the initial item of the population.

So this is our input document (copied from the W3C spec; the orphaned paragraph is my addition) ...

<body>
  <p>This is an orphaned paragraph.</p>
  <h2>Introduction</h2>
  <p>XSLT is used to write stylesheets.</p>
  <p>XQuery is used to query XML databases.</p>
  <h2>What is a stylesheet?</h2>
  <p>A stylesheet is an XML document used to define a transformation.</p>
  <p>Stylesheets may be written in XSLT.</p>
  <p>XSLT 2.0 introduces new grouping constructs.</p>
</body>

... what we want to do is define each group as starting with an h2 and including all the following p elements up until the next h2. The example solution given by the W3C is to use @group-starting-with ...

<xsl:template match="body">
  <chapter>
    <xsl:for-each-group select="*" group-starting-with="h2">
      <section title="{self::h2}">
        <xsl:for-each select="current-group()[self::p]">
          <para><xsl:value-of select="."/></para>
        </xsl:for-each>
      </section>
    </xsl:for-each-group>
  </chapter>
</xsl:template>

In the spec example, when the input does not contain an orphan line, this produces the desired result ...

<chapter>
  <section title="Introduction">
    <para>XSLT is used to write stylesheets.</para>
    <para>XQuery is used to query XML databases.</para>
  </section> 
  <section title="What is a stylesheet?">
    <para>A stylesheet is an XML document used to define a transformation.</para>
    <para>Stylesheets may be written in XSLT.</para>
    <para>XSLT 2.0 introduces new grouping constructs.</para>
  </section>
</chapter>

In our particular case, however, we get this instead ...

<chapter>
   <section title="">
      <para>This is an orphaned paragraph.</para>
   </section>
   <section title="Introduction">
      <para>XSLT is used to write stylesheets.</para>
      <para>XQuery is used to query XML databases.</para>
   </section>
   <section title="What is a stylesheet?">
      <para>A stylesheet is an XML document used to define a transformation.</para>
      <para>Stylesheets may be written in XSLT.</para>
      <para>XSLT 2.0 introduces new grouping constructs.</para>
   </section>
</chapter>

If the initial section for the orphaned paragraph is undesired, there are easy solutions. I won't go into them now. My point is just to highlight the fact that the first group resulting from @group-starting-with can be an 'orphan' group. By 'orphan', I mean a group whose head node does not match the specified pattern.
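For completeness, one possible way to deal with the orphan group (a sketch only, not taken from the spec example) is to test whether the group's head node is actually an h2. Inside xsl:for-each-group the context item is the first item of the current group, so self::h2 is enough to tell the two cases apart. Emitting the orphaned paragraphs without a section wrapper is an arbitrary choice made for this illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <xsl:template match="body">
    <chapter>
      <xsl:for-each-group select="*" group-starting-with="h2">
        <xsl:choose>
          <!-- Normal case: the group is headed by an h2. -->
          <xsl:when test="self::h2">
            <section title="{.}">
              <xsl:for-each select="current-group()[self::p]">
                <para><xsl:value-of select="."/></para>
              </xsl:for-each>
            </section>
          </xsl:when>
          <!-- Orphan group: no heading, so emit the paragraphs unwrapped. -->
          <xsl:otherwise>
            <xsl:for-each select="current-group()[self::p]">
              <para><xsl:value-of select="."/></para>
            </xsl:for-each>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </chapter>
  </xsl:template>

</xsl:stylesheet>
```

To drop the orphaned paragraphs entirely, the xsl:otherwise branch could simply be omitted.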

The collation attribute specifies a collation URI and identifies a collation used to compare strings for equality.

Within the xsl:for-each-group the current-group() function returns the current group being processed as a sequence of items.

Within the xsl:for-each-group, the current-grouping-key() function returns the current grouping key. I am not sure, but I believe that this can only be an atomic value. Also not sure, but I believe that this function is only applicable to the @group-by and @group-adjacent types of grouping.
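The following sketch shows current-group() and current-grouping-key() working together to produce one summary element per group. The orders input, with its customer and amount attributes, is hypothetical and invented for this example.

```xml
<!-- Hypothetical input (names are assumptions):
     <orders>
       <order customer="acme"   amount="10"/>
       <order customer="globex" amount="5"/>
       <order customer="acme"   amount="7"/>
     </orders> -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <xsl:template match="orders">
    <summary>
      <xsl:for-each-group select="order" group-by="@customer">
        <!-- current-grouping-key() is the shared customer value;
             current-group() is the sequence of orders in this group,
             so ordinary aggregate functions can be applied to it. -->
        <customer id="{current-grouping-key()}"
                  orders="{count(current-group())}"
                  total="{sum(current-group()/@amount)}"/>
      </xsl:for-each-group>
    </summary>
  </xsl:template>

</xsl:stylesheet>
```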

In some scenarios you will have a choice between @group-by and @group-adjacent with the same functional result (for example, when the population is already ordered so that items with equal keys are adjacent). When this is the case, @group-adjacent is to be preferred over @group-by, because it will likely be more efficient to process.

Some XSLT 2.0 instruction attributes contain select expressions. Michael Kay calls these "XPath expressions". Personally, when contrasting them with patterns, I feel a better description would be "select expressions". Other attributes contain patterns, or "match expressions". While the two share the same syntax, they are very different beasts. The similarity between them often leads XSLT beginners to think of xsl:template/@match not as a pattern, but as a select expression. The consequence has been a lot of confusion among beginners about the value of the position() function within a template's sequence constructor. As stated earlier, in xsl:for-each-group, @select, @group-by and @group-adjacent are select expressions, but @group-starting-with and @group-ending-with are patterns. So here is the difference:

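  1. A select expression is like a function. The input is the context document, the context sequence, the context item, the context position and, of course, the actual expression. The output is a sequence of items. Depending on where it is used, this may become the next context sequence. The default axis is child:: .
  2. In contrast to a select expression, a pattern can be thought of as having a default axis of self:: . A pattern is also like a function. Its input is as before, but its output is not a sequence; it is a boolean. Some item is being tested to see whether it matches the pattern. The item being tested becomes the context item. The match expression is temporarily evaluated as if it were a select expression. The returned sequence is then tested to see whether the context item is a member, and is then discarded. If the item is a member, the result is true, or "match"; otherwise it is false.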
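The position() confusion mentioned above can be made concrete with a small sketch that reuses the body/h2/p input from the grouping example. The para element and its n attribute are illustrative names. The select expression on xsl:apply-templates decides which nodes are processed and in what sequence; the match pattern only answers yes or no for each node.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <xsl:template match="body">
    <body>
      <!-- Select expression: evaluated relative to the context node (body),
           with child:: as the default axis. It yields the p children only. -->
      <xsl:apply-templates select="p"/>
    </body>
  </xsl:template>

  <!-- Pattern: a boolean test applied to whichever node is being processed. -->
  <xsl:template match="p">
    <!-- position() is the node's place in the sequence selected by
         xsl:apply-templates, not its place among all children of body. -->
    <para n="{position()}"><xsl:value-of select="."/></para>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the body input shown earlier, the paragraph "XQuery is used to query XML databases." gets n="3", even though it is the fourth element child of body, because the h2 elements were never selected.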
