How to split text and preserve HTML tags (XSLT 2.0)


Problem description

I have an XML document with a description node:

<config>
  <desc>A <b>first</b> sentence here. The second sentence with some link <a href="myurl">The link</a>. The <u>third</u> one.</desc>
</config>

I am trying to split the description into sentences, using the period as the separator, while keeping any HTML tags in the HTML output. What I have so far is a template that splits the description, but the HTML tags are lost in the output because of the normalize-space and substring-before functions. My current template is given below:

<xsl:template name="output-tokens">
  <xsl:param name="sourceText" />

  <!-- Force a . at the end -->
  <xsl:variable name="newlist" select="concat(normalize-space($sourceText), ' ')" />
  <!-- Check if we have really a point at the end -->
  <xsl:choose>
    <xsl:when test ="contains($newlist, '.')">
      <!-- Find the first . in the string -->
      <xsl:variable name="first" select="substring-before($newlist, '.')" />

      <!-- Get the remaining text -->
      <xsl:variable name="remaining" select="substring-after($newlist, '.')" />
      <!-- Check if our string is not in fact a . or an empty string -->
      <xsl:if test="normalize-space($first)!='.' and normalize-space($first)!=''">
        <p><xsl:value-of select="normalize-space($first)" />.</p>
      </xsl:if>
      <!-- Recursively apply the template for the remaining text -->
      <xsl:if test="$remaining">
        <xsl:call-template name="output-tokens">
          <xsl:with-param name="sourceText" select="$remaining" />
        </xsl:call-template>
      </xsl:if>
    </xsl:when>
    <!--If no . was found -->
    <xsl:otherwise>
      <p>
        <!-- If the string does not contain a . then display the text, but avoid
           displaying empty strings
         -->
        <xsl:if test="normalize-space($sourceText)!=''">
          <xsl:value-of select="normalize-space($sourceText)" />.
        </xsl:if>
      </p>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

I call it as follows:

<xsl:template match="config">
  <xsl:call-template name="output-tokens">
       <xsl:with-param name="sourceText" select="desc" />
  </xsl:call-template>
</xsl:template>

The expected output is:

<p>A <b>first</b> sentence here.</p>
<p>The second sentence with some link <a href="myurl">The link</a>.</p>
<p>The <u>third</u> one.</p>
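
For context, the markup is lost before any splitting happens: passing desc to a string function such as normalize-space() atomizes the element, which yields the concatenated text of its descendants without the tags. A minimal sketch of the difference, assuming the sample input above (the template and the as-string and as-nodes wrapper elements are hypothetical, introduced only for illustration, and are not part of the original stylesheet):

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Illustrative only: contrasts the string value of desc with its child nodes -->
  <xsl:template match="config">
    <!-- String value: the tags are dropped, only the text survives -->
    <as-string><xsl:value-of select="normalize-space(desc)"/></as-string>
    <!-- Child nodes: b, a and u elements are copied along with the text -->
    <as-nodes><xsl:copy-of select="desc/node()"/></as-nodes>
  </xsl:template>

</xsl:stylesheet>
```

Any approach that has to keep the tags therefore needs to work on the nodes themselves instead of on the string value.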

Recommended answer

Here is one way to implement the second approach suggested by Michael Kay, using XSLT 2.0.

This stylesheet demonstrates a two-pass transformation. The first pass inserts a <stop/> marker after each sentence end (a period followed by whitespace), and the second pass wraps each group of nodes ending with a <stop/> in a paragraph.

<xsl:stylesheet version="2.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- two-pass processing -->
  <xsl:template match="/">
    <xsl:variable name="intermediate">
      <xsl:apply-templates mode="phase-1"/>
    </xsl:variable>
    <xsl:apply-templates select="$intermediate" mode="phase-2"/>
  </xsl:template>

  <!-- identity transform -->
  <xsl:template match="@*|node()" mode="#all" priority="-1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- phase 1 -->

  <!-- insert <stop/> "milestone markup" after each sentence -->
  <xsl:template match="text()" mode="phase-1">
    <xsl:analyze-string select="." regex="\.\s+">
      <xsl:matching-substring>
        <xsl:value-of select="regex-group(0)"/>
        <stop/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

  <!-- phase 2 -->

  <!-- turn each <stop/>-terminated group into a paragraph -->
  <xsl:template match="*[stop]" mode="phase-2">
    <xsl:copy>
      <xsl:for-each-group select="node()" group-ending-with="stop">
        <p>
          <xsl:apply-templates select="current-group()" mode="#current"/>
        </p>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

  <!-- remove the <stop/> markers -->
  <xsl:template match="stop" mode="phase-2"/>

</xsl:stylesheet>
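
To make the two passes concrete, this is what the intermediate tree held in $intermediate should look like for the sample input. It was worked out by hand from the phase-1 template, not captured from a processor run:

```xml
<config>
  <desc>A <b>first</b> sentence here. <stop/>The second sentence with some link <a href="myurl">The link</a>. <stop/>The <u>third</u> one.</desc>
</config>
```

Phase 2 then wraps each run of nodes up to and including a <stop/> in a <p>. The trailing run ("The <u>third</u> one.") gets no <stop/> because its final period is not followed by whitespace, but for-each-group still emits it as a last group of its own. Note that the identity template keeps the <config> and <desc> wrappers around the paragraphs, and that the whitespace after each period stays inside the preceding <p>.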
