XSLT - How to keep only wanted elements from XML

Question

I have a number of XML files containing lots of overhead. I wish to keep only about 20 specific elements and filter out everything else. I know the names of all the elements I want to keep, and I also know whether they are child elements and what their parents are. The elements I keep must still be in their original place in the hierarchy after the transformation.

For example, I want to keep only <ns:currency> in:

<ns:stuff>
 <ns:things>
  <ns:currency>somecurrency</ns:currency>
  <ns:currency_code/>
  <ns:currency_code2/>
  <ns:currency_code3/>
  <ns:currency_code4/>
 </ns:things>
</ns:stuff>

so that it looks like this:

<ns:stuff>
 <ns:things>
  <ns:currency>somecurrency</ns:currency>
 </ns:things>
</ns:stuff>

What would be the best way of constructing an XSLT to accomplish this?

Recommended answer

This generic transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ns="some:ns">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <ns:WhiteList>
  <name>ns:currency</name>
  <name>ns:currency_code3</name>
 </ns:WhiteList>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match=
  "*[not(descendant-or-self::*[name()=document('')/*/ns:WhiteList/*])]"/>
</xsl:stylesheet>

when applied to the provided XML document (with a namespace definition added to make it well-formed):

<ns:stuff xmlns:ns="some:ns">
    <ns:things>
        <ns:currency>somecurrency</ns:currency>
        <ns:currency_code/>
        <ns:currency_code2/>
        <ns:currency_code3/>
        <ns:currency_code4/>
    </ns:things>
</ns:stuff>

produces the wanted result (the white-listed elements and their structural relations are preserved):

<ns:stuff xmlns:ns="some:ns">
   <ns:things>
      <ns:currency>somecurrency</ns:currency>
      <ns:currency_code3/>
   </ns:things>
</ns:stuff>

Explanation:

  1. The identity rule/template copies all nodes "as-is".

  2. The stylesheet contains a top-level <ns:WhiteList> element whose <name> children specify the names of all white-listed elements: the elements that are to be preserved, together with their structural relationships in the document.

  3. The <ns:WhiteList> element is best kept in a separate document, so that the stylesheet does not need to be edited when names are added. Here the whitelist is in the same stylesheet just for convenience.

  4. A single template overrides the identity template. It does not process (and so deletes) any element that is not white-listed and has no white-listed descendant.
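
The explanation recommends keeping the whitelist in a separate document. A minimal sketch of that variant follows. It assumes a hypothetical file named whitelist.xml stored next to the stylesheet; neither the file name nor its no-namespace <WhiteList> root element comes from the original answer.

```xml
<!-- whitelist.xml (hypothetical file name), stored next to the stylesheet -->
<WhiteList>
  <name>ns:currency</name>
  <name>ns:currency_code3</name>
</WhiteList>
```

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Identity rule: copies every node and attribute as-is -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Empty template: drops any element that is not white-listed and has
      no white-listed descendant. The relative URI 'whitelist.xml' is
      resolved against the stylesheet's own location. The lookup is written
      inline because XSLT 1.0 does not allow variable references in match
      patterns. -->
 <xsl:template match=
  "*[not(descendant-or-self::*[name()=document('whitelist.xml')/*/name])]"/>
</xsl:stylesheet>
```

Applied to the same source document, this should keep the same elements as the original stylesheet. Adding another name then only requires editing whitelist.xml.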

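The whitelist comparison in the answer uses name(), which returns the element name with whatever prefix the source document happens to use. For a short, fixed list, one possible alternative (a sketch, not part of the original answer) is to test the elements directly with the self:: axis. That test compares the namespace URI and local name, so it does not depend on the prefix chosen in the source. The two element names below are simply the ones from the example.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ns="some:ns">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Identity rule: copies every node and attribute as-is -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Drops any element that is neither one of the listed elements
      nor an ancestor of one -->
 <xsl:template match=
  "*[not(descendant-or-self::*[self::ns:currency or self::ns:currency_code3])]"/>
</xsl:stylesheet>
```

The trade-off is that the names are now hard-coded in the match pattern, so the stylesheet has to be edited whenever the list changes.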