Removing line breaks and broken entities using XSLT

Problem description

My XML is generated from a web form, and some users are inserting line breaks and characters that are being converted to line breaks (\n), as well as broken entities like &amp;amp;

I'm using some variables to convert and remove bad characters, but I don't know how to strip out these types of characters.

Here's the method I'm using to convert or strip out other bad characters. Let me know if you need to see the entire XSL. …

<xsl:variable name="smallcase" select="'abcdefghijklmnopqrstuvwxyz_aaea'" />
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ äãêÂ.,'" />
<xsl:variable name="linebreaks" select="'\n'" />
<xsl:variable name="nolinebreaks" select="' '" />

<xsl:value-of select="translate(Surname, $uppercase, $smallcase)"/>
<xsl:value-of select="translate(normalize-space(Office_photos), $linebreaks, $nolinebreaks)"/>

The text in the XML contains content like this:

<Office_photos>bn_1.jpg: Showing a little Red Sox Pride!&#13;\nLeft to right: 
 Tessa Michelle Summers, \nJulie Gross, Alexis Drzewiecki</Office_photos>

I'm trying to get rid of the \n characters inside the data.

Solution

As Lingamurthy CS explains in the comments, \n is not treated as a single character in XML. It is simply parsed as two characters, a backslash followed by n, without any special handling.
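
This also explains why the translate() attempt misbehaves. As a small illustration (the sample string is made up for this sketch): translate() works character by character, so with '\n' as the search string and a single space as the replacement, each backslash becomes a space and each n is dropped, because it has no counterpart in the replacement string.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <xsl:template match="/">
      <!-- Outputs "Showig Pride": the n inside "Showing" is deleted as well -->
      <xsl:value-of select="translate('Showing\nPride', '\n', ' ')" />
   </xsl:template>
</xsl:stylesheet>
```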

If that is literally what you want to change, then in XSLT 1.0 you will need a recursive template to replace the text (XSLT 2.0 has a replace() function; XSLT 1.0 does not).
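
For comparison, if an XSLT 2.0 processor happens to be available (an assumption; the question does not say which processor is in use), the same replacement could be sketched with replace(). The pattern is a regular expression, so the backslash has to be doubled to match a literal backslash followed by n.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <!-- Assumes Office_photos is the document element, as in the sample data -->
   <xsl:template match="/">
      <!-- '\\n' is the regex for a literal backslash followed by n;
           a bare '\n' would match a real newline character instead -->
      <xsl:value-of select="normalize-space(replace(Office_photos, '\\n', ' '))" />
   </xsl:template>
</xsl:stylesheet>
```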

A quick search on Stack Overflow finds one such template in the question "XSLT string replace".

To call it, instead of doing this:

<xsl:value-of select="translate(normalize-space(Office_photos), $linebreaks, $nolinebreaks)"/>

you would do this:

  <xsl:call-template name="string-replace-all">
     <xsl:with-param name="text" select="Office_photos" />
     <xsl:with-param name="replace" select="$linebreaks" />
     <xsl:with-param name="by" select="$nolinebreaks" /> 
  </xsl:call-template>

Try this XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output omit-xml-declaration="yes" indent="yes" />

   <xsl:variable name="linebreaks" select="'\n'" />
   <xsl:variable name="nolinebreaks" select="' '" />

   <xsl:template match="/">
      <xsl:call-template name="string-replace-all">
         <xsl:with-param name="text" select="Office_photos" />
         <xsl:with-param name="replace" select="$linebreaks" />
         <xsl:with-param name="by" select="$nolinebreaks" /> 
      </xsl:call-template>
   </xsl:template>

   <xsl:template name="string-replace-all">
     <xsl:param name="text" />
     <xsl:param name="replace" />
     <xsl:param name="by" />
     <xsl:choose>
       <xsl:when test="contains($text, $replace)">
         <xsl:value-of select="substring-before($text,$replace)" />
         <xsl:value-of select="$by" />
         <xsl:call-template name="string-replace-all">
           <xsl:with-param name="text" select="substring-after($text,$replace)" />
           <xsl:with-param name="replace" select="$replace" />
           <xsl:with-param name="by" select="$by" />
         </xsl:call-template>
       </xsl:when>
       <xsl:otherwise>
         <xsl:value-of select="$text" />
       </xsl:otherwise>
     </xsl:choose>
   </xsl:template>
</xsl:stylesheet>

(Credit to Mark Elliot who created the replace template)
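
The stylesheet above deals only with the literal \n. As a hedged sketch of how the same template might also cover the broken &amp;amp; entities and the real line breaks from the question, the calls can be chained through variables. The variable names no-backslash-n and fixed-entities are made up for this example, and it assumes Office_photos is the document element, as in the sample data.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <xsl:template match="/">
      <!-- Step 1: replace the literal two-character sequence \n with a space -->
      <xsl:variable name="no-backslash-n">
         <xsl:call-template name="string-replace-all">
            <xsl:with-param name="text" select="Office_photos" />
            <xsl:with-param name="replace" select="'\n'" />
            <xsl:with-param name="by" select="' '" />
         </xsl:call-template>
      </xsl:variable>

      <!-- Step 2: a double-escaped ampersand in the source is parsed as the
           five-character string "&amp;". Inside this stylesheet that string
           has to be escaped once more, and a plain ampersand once. -->
      <xsl:variable name="fixed-entities">
         <xsl:call-template name="string-replace-all">
            <xsl:with-param name="text" select="$no-backslash-n" />
            <xsl:with-param name="replace" select="'&amp;amp;'" />
            <xsl:with-param name="by" select="'&amp;'" />
         </xsl:call-template>
      </xsl:variable>

      <!-- Step 3: trim the string and collapse real whitespace, including
           newlines and carriage returns, into single spaces -->
      <xsl:value-of select="normalize-space($fixed-entities)" />
   </xsl:template>

   <!-- Same recursive replace template as above -->
   <xsl:template name="string-replace-all">
      <xsl:param name="text" />
      <xsl:param name="replace" />
      <xsl:param name="by" />
      <xsl:choose>
         <xsl:when test="contains($text, $replace)">
            <xsl:value-of select="substring-before($text, $replace)" />
            <xsl:value-of select="$by" />
            <xsl:call-template name="string-replace-all">
               <xsl:with-param name="text" select="substring-after($text, $replace)" />
               <xsl:with-param name="replace" select="$replace" />
               <xsl:with-param name="by" select="$by" />
            </xsl:call-template>
         </xsl:when>
         <xsl:otherwise>
            <xsl:value-of select="$text" />
         </xsl:otherwise>
      </xsl:choose>
   </xsl:template>
</xsl:stylesheet>
```

This makes a single pass per replacement, so an entity that was escaped three or more times would still need another round. The text output method is used here so the repaired ampersand is written as-is; with XML output the serializer would escape it once, which is the correct form.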
