Creating parent-child elements from semantic hierarchy in element values in XSLT 2

Problem description

I have a series of p elements in XML content whose leading values encode a semantic hierarchy; however, the p elements themselves are flat siblings. I am looking for an XSLT 2 transformation that nests them according to that hierarchy.

The semantic hierarchy is as follows:

(1)
 +-(a)
    +-(I)
       +-(A)

The regular expressions used are as follows:

<xsl:param name="patternOrder" as="element(pattern)*" xmlns="">
  <pattern level="1" value="^(\([0-9]+(\.[0-9]+)?\))" />
  <pattern level="2" value="^(\([a-z]\))" />
  <pattern level="3" value="^(\((IX|IV|V?I{{0,3}})\))" />
  <pattern level="4" value="^(\([\w]+(\.[\w]+)?\))" />
</xsl:param>

After reviewing my dataset, I found the following conditions:

<?xml version="1.0" encoding="UTF-8"?>
<test>
    <content>
        <p>(1) blah</p>
        <p>(2)(a) blah</p>
        <p>(b) blah</p>
        <p>(3)(a)(I) blah</p>
        <p>(II) blah</p>
        <p>(A) blah</p>
        <p>(B.1) blah</p>
        <p>(b) blah</p>
        <p>(4) blah</p>
        <p>(4.5) blah</p>
        <p>(5)(a)(I)(A) blah</p>
        <p>(B) blah</p>
        <p>(II) blah</p>
        <p>(III)(a) blah</p>
        <p>(bb.2) blah</p>
        <p>(6) blah</p>
    </content>
    <content>
        <p>blah</p>
    </content>
    <content>
        <p>blah</p>
        <p>(1) blah</p>
        <p>(a) blah</p>
        <p>(b) blah</p>
        <p>(2) blah </p>
    </content>
</test>

...and the end result should be:

<?xml version="1.0" encoding="UTF-8"?>
<test>
    <content>
        <p>(1) blah</p>
        <p>(2)
            <p>(a) blah</p>
            <p>(b) blah</p>
        </p>
        <p>(3)
            <p>(a)
                <p>(I) blah</p>
                <p>(II) blah
                    <p>(A) blah</p>
                    <p>(B.1) blah</p>
                </p>
            </p>
            <p>(b) blah</p>
        </p>
        <p>(4) blah</p>
        <p>(4.5) blah</p>
        <p>(5)
            <p>(a)
                <p>(I)
                    <p>(A) blah</p>
                    <p>(B) blah</p>
                </p>      
                <p>(II) blah</p>
                <p>(III)
                    <p>(a) blah</p>
                    <p>(bb.2) blah</p>
                </p>
            </p>
        </p>
        <p>(6) blah</p>
    </content>
    <content>
        blah
    </content>
    <content>
        blah
        <p>(1) blah
            <p>(a) blah</p>
            <p>(b) blah</p>      
        </p>
        <p>(2) blah </p>
    </content>
</test>

Please note one special case: if a p element does not start with a semantic marker, the p element itself is removed and its text becomes direct content of its parent content element.

I have been able to detect all the semantic conditions using the following regular expression:

^(\(([\w]+(\.[\w]+)?)\)){1,4}

*EDIT #2*

With the level attributes:

<?xml version="1.0" encoding="UTF-8"?>
<test>
    <content>
        <p level="1">(1) blah</p>
        <p level="1">(2)</p>
        <p level="2">(a) blah</p>
        <p level="2">(b) blah</p>
        <p level="1">(3)</p>
        <p level="2">(a)</p>
        <p level="3">(I) blah</p>
        <p level="3">(II) blah</p>
        <p level="4">(A) blah</p>
        <p level="4">(B.1) blah</p>
        <p level="2">(b) blah</p>
        <p level="1">(4) blah</p>
        <p level="1">(4.5) blah</p>
        <p level="1">(5)</p>
        <p level="2">(a)</p>
        <p level="3">(I)</p>
        <p level="4">(A) blah</p>
        <p level="4">(B) blah</p>
        <p level="3">(II) blah</p>
        <p level="3">(III)</p>
        <p level="2">(a) blah</p>
        <p level="2">(bb.2) blah</p>
        <p level="1">(6) blah</p>
    </content>
    <content>
        <p>blah</p>
    </content>
    <content>
        <p>blah</p>
        <p level="1">(1) blah</p>
        <p level="2">(a) blah</p>
        <p level="2">(b) blah</p>
        <p level="1">(2) blah </p>
    </content>
</test>

Recommended answer

Phase one: transform

<p>(2)(a) blah</p>
<p>(b) blah</p>

into

<p>(2)</p>
<p>(a) blah</p>
<p>(b) blah</p>

using something like

<xsl:template match="p">
  <xsl:for-each select="tokenize(., '\(')">
     <xsl:if test="normalize-space(.)">
       <p>(<xsl:value-of select="."/></p>
     </xsl:if>
  </xsl:for-each>
</xsl:template>
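
Tokenizing on every "(" also splits parentheses that occur later in the text, and it prefixes "(" to a p that has no marker at all. As a hedged alternative, the sketch below splits only the run of leading markers, reusing the detection regex from the question. It assumes that markers start at the first character of the p and that a p without a marker should lose its wrapper, as the question asks. It is a sketch, not a tested replacement for the template above.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Identity template: copy whatever the p template below does not handle. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="p">
    <!-- Group 1: the run of leading markers, e.g. "(3)(a)(I)".
         Group 4: the remaining text. Flag "s" lets "." match newlines. -->
    <xsl:analyze-string select="string(.)" flags="s"
        regex="^((\(\w+(\.\w+)?\))+)(.*)$">
      <xsl:matching-substring>
        <xsl:variable name="rest" as="xs:string" select="regex-group(4)"/>
        <!-- Split "(3)(a)(I)" on ")" and put the ")" back on each piece. -->
        <xsl:variable name="markers" as="xs:string*"
            select="for $m in tokenize(regex-group(1), '\)')[. ne '']
                    return concat($m, ')')"/>
        <xsl:for-each select="$markers">
          <p>
            <xsl:value-of select="."/>
            <!-- Only the last marker keeps the trailing text. -->
            <xsl:if test="position() eq last()">
              <xsl:value-of select="$rest"/>
            </xsl:if>
          </p>
        </xsl:for-each>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- No leading marker: emit the text without a p wrapper. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, a p containing "(2)(a) blah" should come out as one p for "(2)" followed by one p for "(a) blah", while a p containing only "blah" is reduced to its text.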

Phase two:

First write a function

<xsl:function name="f:level" as="xs:integer">
  <xsl:param name="p" as="element(p)"/>
  ....
</xsl:function>

which computes the "semantic level" based on matching your regular expressions. You seem to know how to do this part.
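
For illustration only, here is one way the body of f:level might be filled in. This is a minimal sketch under several assumptions: it reuses the patternOrder parameter from the question, it lets the first matching pattern win, it returns 0 for a p without any marker, and the namespace URI urn:example:functions bound to the f prefix is a placeholder, since the original does not show one.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <!-- Patterns from the question. Attributes of literal result elements are
       attribute value templates, so {{0,3}} yields the regex quantifier {0,3}. -->
  <xsl:param name="patternOrder" as="element(pattern)*" xmlns="">
    <pattern level="1" value="^(\([0-9]+(\.[0-9]+)?\))" />
    <pattern level="2" value="^(\([a-z]\))" />
    <pattern level="3" value="^(\((IX|IV|V?I{{0,3}})\))" />
    <pattern level="4" value="^(\([\w]+(\.[\w]+)?\))" />
  </xsl:param>

  <xsl:function name="f:level" as="xs:integer">
    <xsl:param name="p" as="element(p)"/>
    <!-- First pattern, in the order listed, that matches the start of the text. -->
    <xsl:variable name="hit" as="element(pattern)?"
        select="$patternOrder[matches($p, @value)][1]"/>
    <!-- Assumption: 0 signals "no marker found". -->
    <xsl:sequence select="if (exists($hit)) then xs:integer($hit/@level) else 0"/>
  </xsl:function>

</xsl:stylesheet>
```

Because the level is derived from the marker text alone, a marker such as "(a)" always reports level 2, even when it directly follows a level 3 marker as in "(III)(a)". Cases like that may need extra context to reach the nesting shown in the question.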

Then write a recursive grouping function:

<xsl:function name="f:group" as="element(p)*">
  <xsl:param name="in" as="element(p)*"/>
  <xsl:param name="level" as="xs:integer"/>
  <!-- Each p at the current level starts a new group. -->
  <xsl:for-each-group select="$in" group-starting-with="p[f:level(.)=$level]">
    <!-- The first p of the group becomes the parent; the rest are grouped one level deeper. -->
    <p><xsl:value-of select="current-group()[1]"/>
      <xsl:sequence select="f:group(current-group()[position() gt 1], $level+1)"/>
    </p>
  </xsl:for-each-group>
</xsl:function>

and call this function like this:

<xsl:template match="content">
  <xsl:sequence select="f:group(p, 1)"/>
</xsl:template>
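
To show how the pieces could fit together, the following sketch runs the grouping over input that already carries level attributes, as in EDIT #2 of the question. Several things here are assumptions rather than part of the original answer: f:level simply reads the level attribute, the content wrapper is kept, the text of any p without a level is written ahead of the grouped elements, and the f namespace URI is a placeholder.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <!-- Assumption: the level is precomputed in a level attribute; 0 if absent. -->
  <xsl:function name="f:level" as="xs:integer">
    <xsl:param name="p" as="element(p)"/>
    <xsl:sequence select="xs:integer(($p/@level, 0)[1])"/>
  </xsl:function>

  <!-- Recursive grouping as described in the answer. -->
  <xsl:function name="f:group" as="element(p)*">
    <xsl:param name="in" as="element(p)*"/>
    <xsl:param name="level" as="xs:integer"/>
    <xsl:for-each-group select="$in" group-starting-with="p[f:level(.) = $level]">
      <p>
        <xsl:value-of select="current-group()[1]"/>
        <xsl:sequence select="f:group(current-group()[position() gt 1], $level + 1)"/>
      </p>
    </xsl:for-each-group>
  </xsl:function>

  <xsl:template match="test">
    <xsl:copy>
      <xsl:apply-templates select="content"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="content">
    <xsl:copy>
      <!-- A p without a level loses its wrapper; only its text is kept. -->
      <xsl:value-of select="p[not(@level)]" separator=" "/>
      <!-- Every p with a level is nested, starting at level 1. -->
      <xsl:sequence select="f:group(p[@level], 1)"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the content element in the output is a deliberate difference from the template above, which returns only the grouped p elements. The recursion ends because each call passes on the current group minus its first element, so an empty sequence is eventually reached.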

Not tested.
