Using XPath to select contiguous elements with a certain attribute value

Question

I have XML like this:

<span>1</span>
<span class="x">2</span>
<span class="x y">3</span>
<span class="x">4</span>
<span>5</span>
<span class="x">6</span>
<span>7</span>
<span class="x">8</span>

What I want is to use an XSLT stylesheet to put the contents of all elements whose class attribute contains x into one <x> element. So the output should be like this:

1 <x>234</x> 5 <x>6</x> 7 <x>8</x>

(Or ideally,

1 <x>2<y>3</y>4</x> 5 <x>6</x> 7 <x>8</x>

but that's a problem to tackle when I've solved this one.)

This is the relevant fragment of my XSLT:

<xsl:template match="span[contains(@class,'x') and preceding-sibling::span[1][not(contains(@class,'x'))]]">
  <x><xsl:for-each select=". | following-sibling::span[contains(@class,'x')]">
    <xsl:value-of select="text()"/>
  </xsl:for-each></x>
</xsl:template>

<xsl:template match="span[contains(@class,'x') and preceding-sibling::span[1][contains(@class,'x')]]">
</xsl:template>

<xsl:template match="span">
  <xsl:value-of select="text()"/>
</xsl:template>

What this produces is:

1 <x>23468</x> 5 <x>68</x> 7 <x>8</x>

I'm pretty sure I have to use a count in the XPath expression so that it doesn't select all of the following elements with class x, just the contiguous ones. But how can I count the contiguous ones? Or am I doing this the wrong way?

Recommended answer

This is tricky, but doable (long read ahead, sorry for that).

The key to "consecutiveness" in terms of XPath axes (which are by definition not consecutive) is to check whether the closest node in the opposite direction that "first fulfills the condition" also is the one that "started" the series at hand:


a
b  <- first node to fulfill the condition, starts series 1
b  <- series 1
b  <- series 1
a
b  <- first node to fulfill the condition, starts series 2
b  <- series 2
b  <- series 2
a

In your case, a series consists of <span> nodes that have the string x in their @class:

span[contains(concat(' ', @class, ' '),' x ')] 

Note that I pad @class with spaces on both sides to avoid false positives.
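
To make the false-positive point concrete, here is a minimal sketch that counts matches with both tests. The input document in the comment is hypothetical (it is not the sample from the question), and the stylesheet only illustrates the difference between a substring test and a whole-token test.

```xslt
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical input, assumed for illustration only:
       <r><span class="x y"/><span class="box"/><span class="x"/></r> -->
  <xsl:template match="/">
    <!-- Substring test: also counts class="box", because "box" contains "x" -->
    <xsl:value-of select="count(//span[contains(@class, 'x')])"/>
    <xsl:text> </xsl:text>
    <!-- Token test: counts only spans where x is a whole class name -->
    <xsl:value-of select="count(//span[contains(concat(' ', @class, ' '), ' x ')])"/>
  </xsl:template>
</xsl:stylesheet>
```

With that input the substring test counts three spans and the token test counts two. If class names might be separated by tabs or newlines rather than single spaces, wrapping the attribute as normalize-space(@class) inside concat() should cover that case as well.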

A <span> that starts a series (i.e. one that "first fulfills the condition") can be defined as one that has an x in its class and is not directly preceded by another <span> that also has an x:

not(preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')])

We must check this condition in an <xsl:if> so that the template does not generate output for nodes that are inside a series (i.e., the template does actual work only for "starter nodes").

Now to the tricky part.

From each of these "starter nodes" we must select all following-sibling::span nodes that have an x in their class. Also include the current span to account for series that only have one element. Okay, easy enough:

. | following-sibling::span[contains(concat(' ', @class, ' '),' x ')]

For each of these we now find out if their closest "starter node" is identical to the one that the template is working on (i.e. that started their series). This means:

  • they must be part of a series (i.e., they must directly follow a span with an x):

    preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')]

  • now remove any span whose starter node is not identical to the current series starter. That means we check any preceding-sibling span (that has an x) which itself is not directly preceded by a span with an x:

    preceding-sibling::span[contains(concat(' ', @class, ' '),' x ')][
      not(preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')])
    ][1]
    

  • Then we use generate-id() to check node identity. If the found node is identical to $starter, then the current span is one that belongs to the consecutive series.

    Putting it all together:

    <xsl:template match="span[contains(concat(' ', @class, ' '),' x ')]">
      <xsl:if test="not(preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')])">
        <xsl:variable name="starter" select="." />
        <x>
          <xsl:for-each select="
            . | following-sibling::span[contains(concat(' ', @class, ' '),' x ')][
              preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')]
              and
              generate-id($starter)
              =
              generate-id(
                preceding-sibling::span[contains(concat(' ', @class, ' '),' x ')][
                  not(preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')])
                ][1]
              )
            ]
          ">
            <xsl:value-of select="text()" />
          </xsl:for-each>
        </x>
      </xsl:if>
    </xsl:template>
    

    And yes, I know it's not pretty. There is an <xsl:key>-based solution that is more efficient; Dimitre's answer shows it.

    With your sample input, this output is generated:

    1
    <x>234</x>
    5
    <x>6</x>
    7
    <x>8</x>
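
For comparison, the <xsl:key> idea mentioned above could look roughly like the following. This is my own sketch of the general technique, not a copy of Dimitre's answer; the key name kFollowers is made up for illustration, and the sample spans are assumed to sit inside a single root element, which the question does not show.

```xslt
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Index every non-starter x span under the generated id of the
       nearest preceding starter, i.e. the span that began its series -->
  <xsl:key name="kFollowers"
    match="span[contains(concat(' ', @class, ' '),' x ')]
               [preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')]]"
    use="generate-id(
           preceding-sibling::span[contains(concat(' ', @class, ' '),' x ')][
             not(preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')])
           ][1])"/>

  <!-- Starter: an x span that is not directly preceded by another x span -->
  <xsl:template
    match="span[contains(concat(' ', @class, ' '),' x ')]
               [not(preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')])]">
    <x>
      <!-- The starter itself plus every span indexed under its id -->
      <xsl:for-each select=". | key('kFollowers', generate-id())">
        <xsl:value-of select="text()"/>
      </xsl:for-each>
    </x>
  </xsl:template>

  <!-- Non-starter x spans produce nothing here; their starter emits them -->
  <xsl:template
    match="span[contains(concat(' ', @class, ' '),' x ')]
               [preceding-sibling::span[1][contains(concat(' ', @class, ' '),' x ')]]"/>
</xsl:stylesheet>
```

The grouping logic is the same as in the template above; the difference is that the "nearest starter" lookup is computed once per node when the key index is built, instead of being re-evaluated for every following sibling of every starter. Spans without an x are left to the built-in template rules, as before.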
    
