How to handle illegal HTML characters in XSL


Problem description

I have an XML file that stores data, and I use an XSL stylesheet to generate HTML files from it. When I run the transformation, I get the error "Illegal HTML character: decimal 150".

I am not allowed to change the XML file, so I have to map this character, and many other illegal ones, to a legal character (any will do) in the XSL. The mapping has to be generic rather than tied to a single character.

Solution

You can define a character map (available in XSLT 2.0 and later) that maps each disallowed character to an allowed one, for instance a space:

<xsl:output indent="yes" method="html" use-character-maps="m1"/>

<xsl:character-map name="m1">
  <xsl:output-character character="&#150;" string=" "/>
</xsl:character-map>
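
To show where these two declarations sit, here is a minimal sketch of a complete stylesheet. The `match="/"` template and the page it builds are hypothetical placeholders, not part of the original stylesheet; only the `xsl:output` and `xsl:character-map` declarations come from the answer above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The serializer consults the map named m1 when writing the result -->
  <xsl:output method="html" indent="yes" use-character-maps="m1"/>

  <!-- Character maps are top-level declarations -->
  <xsl:character-map name="m1">
    <xsl:output-character character="&#150;" string=" "/>
  </xsl:character-map>

  <!-- Hypothetical template: puts the input's text into a minimal HTML page -->
  <xsl:template match="/">
    <html>
      <head><title>Example</title></head>
      <body>
        <p><xsl:value-of select="."/></p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Because the map is applied during serialization, it only takes effect when the XSLT processor itself serializes the result, for example when writing to a file, and not when the result tree is handed to other code.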

As an alternative, use a template that replaces all illegal characters. According to http://www.w3.org/TR/xslt-xquery-serialization/#HTML_CHARDATA these are the control characters #x7F-#x9F, so using

<xsl:template match="text()">
  <xsl:value-of select="replace(., '[&#x007F;-&#x009F;]', ' ')"/>
</xsl:template>

should ensure that those characters in text nodes of the input document are replaced by a space.
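
Note that a `text()` template only sees text nodes that are reached through `xsl:apply-templates`. Values that are pulled directly, for example with `xsl:value-of select="..."` or in attribute value templates, bypass it. One way to cover those cases is a small stylesheet function. The following is only a sketch: the `my` namespace, the function name `my:clean`, and the input names `item` and `@label` are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:my="http://example.com/my-functions"
  exclude-result-prefixes="xs my">

  <xsl:output method="html" indent="yes"/>

  <!-- Hypothetical helper: replaces each character in #x7F-#x9F with a space.
       An empty input yields a zero-length string. -->
  <xsl:function name="my:clean" as="xs:string">
    <xsl:param name="input" as="xs:string?"/>
    <xsl:sequence select="replace($input, '[&#x007F;-&#x009F;]', ' ')"/>
  </xsl:function>

  <!-- Hypothetical template: "item" and "@label" are assumed input names -->
  <xsl:template match="item">
    <li title="{my:clean(@label)}">
      <xsl:value-of select="my:clean(.)"/>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

Compared with the character map, this works on the result tree rather than on the serialized output, so it also helps when the processor does not serialize the result itself.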

As another alternative, you could output XHTML instead, using elements in the XHTML namespace and the output method xhtml.
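
A minimal sketch of that XHTML variant might look as follows. The template content is a hypothetical placeholder. The point is the default namespace declaration, which puts the literal result elements into the XHTML namespace, together with `method="xhtml"`. As far as I know, the restriction on #x7F-#x9F in the serialization specification belongs to the html output method, so the xhtml method should not raise this error, but check this with your processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns="http://www.w3.org/1999/xhtml">

  <!-- xhtml is a standard XSLT 2.0 output method -->
  <xsl:output method="xhtml" indent="yes"/>

  <!-- Hypothetical template: the literal result elements below are in the
       XHTML namespace because of the default namespace declared above -->
  <xsl:template match="/">
    <html>
      <head><title>Example</title></head>
      <body>
        <p><xsl:value-of select="."/></p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```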

Based on that range of characters, a full character map that maps every illegal control character to a space is

<xsl:character-map name="no-control-characters">
   <xsl:output-character character="&#127;" string=" "/>
   <xsl:output-character character="&#128;" string=" "/>
   <xsl:output-character character="&#129;" string=" "/>
   <xsl:output-character character="&#130;" string=" "/>
   <xsl:output-character character="&#131;" string=" "/>
   <xsl:output-character character="&#132;" string=" "/>
   <xsl:output-character character="&#133;" string=" "/>
   <xsl:output-character character="&#134;" string=" "/>
   <xsl:output-character character="&#135;" string=" "/>
   <xsl:output-character character="&#136;" string=" "/>
   <xsl:output-character character="&#137;" string=" "/>
   <xsl:output-character character="&#138;" string=" "/>
   <xsl:output-character character="&#139;" string=" "/>
   <xsl:output-character character="&#140;" string=" "/>
   <xsl:output-character character="&#141;" string=" "/>
   <xsl:output-character character="&#142;" string=" "/>
   <xsl:output-character character="&#143;" string=" "/>
   <xsl:output-character character="&#144;" string=" "/>
   <xsl:output-character character="&#145;" string=" "/>
   <xsl:output-character character="&#146;" string=" "/>
   <xsl:output-character character="&#147;" string=" "/>
   <xsl:output-character character="&#148;" string=" "/>
   <xsl:output-character character="&#149;" string=" "/>
   <xsl:output-character character="&#150;" string=" "/>
   <xsl:output-character character="&#151;" string=" "/>
   <xsl:output-character character="&#152;" string=" "/>
   <xsl:output-character character="&#153;" string=" "/>
   <xsl:output-character character="&#154;" string=" "/>
   <xsl:output-character character="&#155;" string=" "/>
   <xsl:output-character character="&#156;" string=" "/>
   <xsl:output-character character="&#157;" string=" "/>
   <xsl:output-character character="&#158;" string=" "/>
   <xsl:output-character character="&#159;" string=" "/>
</xsl:character-map>
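
To put the full map to use, reference it by name from the output declaration, in the same way as the single-character map m1. This fragment assumes the no-control-characters map is declared at the top level of the same stylesheet:

```xml
<!-- Assumes the no-control-characters map is declared in this stylesheet -->
<xsl:output method="html" indent="yes"
            use-character-maps="no-control-characters"/>
```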

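I generated that list with XSLT 2.0 and Saxon, using the following stylesheet. It has no match templates, so it is run with the named template main as the entry point (with Saxon, for example, via the -it:main option):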

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias"
  exclude-result-prefixes="xs axsl">

<xsl:param name="start" as="xs:integer" select="127"/>
<xsl:param name="end" as="xs:integer" select="159"/>

<xsl:param name="replacement" as="xs:string" select="' '"/>

<xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>

<xsl:output method="xml" indent="yes" use-character-maps="character-reference"/>

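<!-- On output, « is written as a literal ampersand, so «#127; becomes a numeric character reference -->
<xsl:character-map name="character-reference">
  <xsl:output-character character="«" string="&amp;"/>
</xsl:character-map>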

<xsl:template name="main">
  <axsl:character-map name="no-control-characters">
    <xsl:for-each select="$start to $end">
      <axsl:output-character character="«#{.};" string="{$replacement}"/>
    </xsl:for-each>
  </axsl:character-map>
</xsl:template>

</xsl:stylesheet>
