Why is my XSLT here stripping HTML tags

Question

I am using XSLT 1.0 in order to convert some XML into JSON output. Unfortunately some of the XML I'm working with has HTML markup in it. Here's an example of some XML input:

 <text>
 Kevin Love and Steph Curry can talk about their first-
 time starting gigs in the All-Star game Friday night when the Minnesota
 Timberwolves visit Oracle Arena to face the Golden State Warriors.
</text>
  <continue>
    <P>
 Love and Curry were two of four first-time All-Star starters when the league
 made the announcement on Thursday.
</P>
    <P>
 Love got a late push to overtake Houston Rockets center Dwight Howard in the
 final week of voting.
</P>
    <P>
 "I think it's a little sweeter this way because I really didn't expect it,"
 Love said on a conference call. "I was already humbled by the response the
 fans gave me to being very close to the top (frontcourt players). The outreach
 by the Minnesota fans and beyond was truly amazing."
</P>
</continue>

The markup is not ideal and I need to retain the <P> tags in my JSON output. In order to deal with quotes, I escape them. Here's my template for handling this:

<xsl:variable name="escaped-continue">
    <xsl:call-template name="replace-string">
        <xsl:with-param name="text" select="continue"/>
        <xsl:with-param name="replace" select="'&quot;'"/>
        <xsl:with-param name="with" select="'\&quot;'"/>
    </xsl:call-template>
</xsl:variable>
<xsl:variable name="escaped-text">
    <xsl:call-template name="replace-string">
        <xsl:with-param name="text" select="text"/>
        <xsl:with-param name="replace" select="'&quot;'"/>
        <xsl:with-param name="with" select="'\&quot;'"/>
    </xsl:call-template>
</xsl:variable>

<xsl:template name="replace-string">
    <xsl:param name="text"/>
    <xsl:param name="replace"/>
    <xsl:param name="with"/>
    <xsl:choose>
        <xsl:when test="contains($text,$replace)">
            <xsl:value-of select="substring-before($text,$replace)"/>
            <xsl:value-of select="$with"/>
            <xsl:call-template name="replace-string">
                <xsl:with-param name="text"
                    select="substring-after($text,$replace)"/>
                <xsl:with-param name="replace" select="$replace"/>
                <xsl:with-param name="with" select="$with"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$text"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

I then simply use something like the following to output JSON:

{
    "text": "<xsl:value-of select="normalize-space($escaped-text)"/>", 
    "continue": "<xsl:value-of select="normalize-space($escaped-continue)"/>"
}

The issue I have here is that the output looks like this:

{
 "text": "Kevin Love and Steph Curry can talk about their first- time starting gigs in the All-Star game Friday night when the Minnesota Timberwolves visit Oracle Arena to face the Golden State Warriors.", 
  "continue": "Love and Curry were two of four first-time All-Star starters when the league made the announcement on Thursday. Love got a late push to overtake Houston Rockets center Dwight Howard in the final week of voting. \"I think it's a little sweeter this way because I really didn't expect it,\" Love said on a conference call. \"I was already humbled by the response the fans gave me to being very close to the top (frontcourt players). The outreach by the Minnesota fans and beyond was truly amazing.\"
}

As you can see, the double quotes are properly escaped; however, the <P> tags have been stripped and/or parsed directly by the XSLT parser and then suppressed by normalize-space(). What's the best way to re-add the <P> tags into my output here?

Recommended answer

Try it this way:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:output method="xml" encoding="utf-8" omit-xml-declaration="yes" />

<xsl:template match="/root">
    <xsl:text>{&#10;"text": "</xsl:text>
    <xsl:apply-templates select="text/text()"/>
    <xsl:text>"&#10;"continue": "</xsl:text>
    <xsl:apply-templates select="continue/*"/>
    <xsl:text>"&#10;}</xsl:text>
</xsl:template>

<xsl:template match="*">
    <xsl:copy>
        <xsl:apply-templates/>
    </xsl:copy>
</xsl:template>

<xsl:template match="text()">
<xsl:variable name="escaped-text">
    <xsl:call-template name="replace-string">
        <xsl:with-param name="text" select="."/>
        <xsl:with-param name="replace" select="'&quot;'" />
        <xsl:with-param name="with" select="'\&quot;'"/>
    </xsl:call-template>
</xsl:variable>
<xsl:value-of select="normalize-space($escaped-text)"/>
</xsl:template>

<xsl:template name="replace-string">
    <xsl:param name="text"/>
    <xsl:param name="replace"/>
    <xsl:param name="with"/>
    <xsl:choose>
        <xsl:when test="contains($text,$replace)">
            <xsl:value-of select="substring-before($text,$replace)"/>
            <xsl:value-of select="$with"/>
            <xsl:call-template name="replace-string">
                <xsl:with-param name="text"
                    select="substring-after($text,$replace)"/>
                <xsl:with-param name="replace" select="$replace"/>
                <xsl:with-param name="with" select="$with"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$text"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>

Applied to a modified version of your input (added root element and some more markup for testing):

<root>
    <text>
    Kevin Love and Steph Curry can talk about their first-
    time starting gigs in the All-Star game Friday night when the Minnesota
    Timberwolves visit Oracle Arena to face the Golden State Warriors.
    </text>
    <continue>
        <P>
        Love and Curry were <i>two of <b>four</b> first-time All-Star</i> starters when the league
        made the announcement on Thursday.
        </P>
        <P>
        Love got a late push to overtake Houston Rockets center Dwight Howard in the
        final week of voting.
        </P>
        <P>
        "I think it's a little sweeter this way because I really didn't expect it,"
        Love said on a conference call. "I was already humbled by the response the
        fans gave me to being very close to the top (frontcourt players). The outreach
        by the Minnesota fans and beyond was truly amazing."
        </P>
    </continue>
</root>

produces the following result:

{
"text": "Kevin Love and Steph Curry can talk about their first- time starting gigs in the All-Star game Friday night when the Minnesota Timberwolves visit Oracle Arena to face the Golden State Warriors."
"continue": "<P>Love and Curry were<i>two of<b>four</b>first-time All-Star</i>starters when the league made the announcement on Thursday.</P><P>Love got a late push to overtake Houston Rockets center Dwight Howard in the final week of voting.</P><P>\"I think it's a little sweeter this way because I really didn't expect it,\" Love said on a conference call. \"I was already humbled by the response the fans gave me to being very close to the top (frontcourt players). The outreach by the Minnesota fans and beyond was truly amazing.\"</P>"
}
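
A note on why the original attempt lost the markup, as far as can be told from the code shown: select="continue" hands the element to replace-string, where contains(), substring-before() and xsl:value-of all operate on its string value. In XPath 1.0 the string value of an element is the concatenation of its descendant text nodes, so the <P> tags are already gone before normalize-space() ever runs. The minimal sketch below is only an illustration, not part of the answer; it assumes the same <root> wrapper used for testing above and contrasts the two behaviours:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Assumes the input is wrapped in <root>, as in the test document above -->
  <xsl:template match="/root">
    <!-- value-of writes the string value of <continue>:
         descendant text only, so no <P> tags appear -->
    <xsl:value-of select="continue"/>

    <!-- copy-of writes the child nodes themselves,
         so the <P> elements are serialized with their tags -->
    <xsl:copy-of select="continue/node()"/>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet in the answer avoids the problem by walking the tree with xsl:apply-templates and xsl:copy and escaping only the text nodes. Two details of its output may still need attention. There is no comma between the "text" and "continue" members, which a JSON parser would reject; adding a comma after the first quote in the second xsl:text should fix that. Also, calling normalize-space() on each text node separately drops the spaces next to inline elements, as in "were<i>two of".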
