Preserving entity references when transforming XML with XSLT?


Question

How can I preserve entity references when transforming XML with XSLT (2.0)? With all of the processors I've tried, the entity gets resolved by default. I can use xsl:character-map to handle the character entities, but what about text entities?

For example, this XML:

<!DOCTYPE doc [
<!ENTITY so "stackoverflow">
<!ENTITY question "How can I preserve the entity reference when transforming with XSLT??">
]>
<doc>
  <text>Hello &so;!</text>
  <text>&question;</text>
</doc>

transformed with the following XSLT:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

produces the following output:

<doc>
   <text>Hello stackoverflow!</text>
   <text>How can I preserve the entity reference when transforming with XSLT??</text>
</doc>

The output should look like the input (minus the doctype declaration for now):

<doc>
  <text>Hello &so;!</text>
  <text>&question;</text>
</doc>

I'm hoping that I don't have to pre-process the input by replacing all ampersands with &amp; (like &amp;question;) and then post-process the output by replacing all &amp; with &.

Maybe this is processor specific? I'm using Saxon 9.

Thanks!

Recommended answer

If you know what entities will be used and how they are defined, you can do the following (quite primitive and error-prone, but still better than nothing):

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:my="my:my">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:character-map name="mapEntities">
  <xsl:output-character character="&amp;" string="&amp;"/>
 </xsl:character-map>

 <xsl:variable name="vEntities" select=
 "'stackoverflow',
 'How can I preserve the entity reference when transforming with XSLT\?\?'
 "/>

 <xsl:variable name="vReplacements" select=
 "'&amp;so;', '&amp;question;'"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/">
  <xsl:text disable-output-escaping="yes"><![CDATA[<!DOCTYPE doc [ <!ENTITY so "stackoverflow">
<!ENTITY question
"How can I preserve the entity reference when transforming with XSLT??"> ]>
]]>
  </xsl:text>

  <xsl:apply-templates/>
 </xsl:template>

 <xsl:template match="text()">
  <xsl:value-of select=
  "my:multiReplace(.,
                   $vEntities,
                   $vReplacements,
                   count($vEntities)
                   )
  " disable-output-escaping="yes"/>
 </xsl:template>

 <xsl:function name="my:multiReplace">
  <xsl:param name="pText" as="xs:string"/>
  <xsl:param name="pEnts" as="xs:string*"/>
  <xsl:param name="pReps" as="xs:string*"/>
  <xsl:param name="pCount" as="xs:integer"/>

  <xsl:sequence select=
  "if($pCount > 0)
     then
      my:multiReplace(replace($pText,
                              $pEnts[1],
                              $pReps[1]
                              ),
                      subsequence($pEnts,2),
                      subsequence($pReps,2),
                      $pCount -1
                      )
      else
       $pText
  "/>
 </xsl:function>
</xsl:stylesheet>

When applied to the provided XML document:

<!DOCTYPE doc [ <!ENTITY so "stackoverflow">
<!ENTITY question
"How can I preserve the entity reference when transforming with XSLT??"> ]>
<doc>
    <text>Hello &so;!</text>
    <text>&question;</text>
</doc>

the wanted result is produced:

<!DOCTYPE doc [ <!ENTITY so "stackoverflow">
<!ENTITY question
"How can I preserve the entity reference when transforming with XSLT??"> ]>

  <doc>
      <text>Hello &so;!</text>
      <text>&question;</text>
</doc>

Do note:

  1. Special (RegEx) characters in the entity values must be escaped (as with \?\? above), because each value is passed to replace() as a pattern. A literal $ or \ in a replacement string would need escaping as well.

  2. We had to resort to DOE (disable-output-escaping), which isn't recommended, because it violates the principles of the XSLT architecture and processing model. In other words, this solution is a nasty hack.
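The same replacement idea can probably be done without DOE by actually wiring a character map into xsl:output. The sketch below is not part of the original answer; it is a minimal variant under a few assumptions: the private-use character U+E000 never occurs in the input, the processor serializes the result itself (character maps only take effect during serialization), and the names my:escape, my:restore, vValues and vNames are made up for illustration. It escapes the entity values automatically instead of by hand, and it does not emit the DOCTYPE, so the internal subset would still have to be added by other means.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:my="my:my">
 <xsl:output omit-xml-declaration="yes" use-character-maps="mapEntities"/>

 <!-- Assumption: U+E000 is unused in the input; it is serialized as a raw "&" -->
 <xsl:character-map name="mapEntities">
  <xsl:output-character character="&#xE000;" string="&amp;"/>
 </xsl:character-map>

 <!-- Entity replacement texts (plain, unescaped) and the matching entity names -->
 <xsl:variable name="vValues" select=
 "'stackoverflow',
  'How can I preserve the entity reference when transforming with XSLT??'"/>
 <xsl:variable name="vNames" select="'so', 'question'"/>

 <!-- Identity template -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Text nodes: put the placeholder-based references back -->
 <xsl:template match="text()">
  <xsl:value-of select="my:restore(., $vValues, $vNames)"/>
 </xsl:template>

 <!-- Escapes regex metacharacters so that a value is matched literally -->
 <xsl:function name="my:escape" as="xs:string">
  <xsl:param name="pText" as="xs:string"/>
  <xsl:sequence select=
  "replace($pText,
           '(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))',
           '\\$1')"/>
 </xsl:function>

 <!-- Replaces each entity value with U+E000 + name + ';', one value per call -->
 <xsl:function name="my:restore" as="xs:string">
  <xsl:param name="pText" as="xs:string"/>
  <xsl:param name="pValues" as="xs:string*"/>
  <xsl:param name="pNames" as="xs:string*"/>
  <xsl:sequence select=
  "if (empty($pValues))
     then $pText
     else my:restore(replace($pText,
                             my:escape($pValues[1]),
                             concat('&#xE000;', $pNames[1], ';')),
                     subsequence($pValues, 2),
                     subsequence($pNames, 2))"/>
 </xsl:function>
</xsl:stylesheet>
```

Like the DOE version, this only guesses where references were: any text that happens to equal an entity's value is rewritten, whether or not it came from a reference, and attribute values are left untouched.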
