XSLT: Ignore duplicate elements across multiple files


Question

I recently asked a question about how to ignore duplicate elements, and got some good responses about using the 'preceding' axis and the Muenchian Method. However, I was wondering whether it is possible to do this across multiple files, using an index XML file.

index.xml

<?xml-stylesheet type="text/xsl" href="merge2.xsl"?>
<list>
    <entry name="File1.xml" />
    <entry name="File2.xml" />
</list>

Example XML file

<Main>
    <Records>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>B</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
    </Records>
    <Records>
        <Record>
            <Description>B</Description>
        </Record>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
    </Records>
</Main>

Merge2.xsl

  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  <xsl:output method="xml" indent="yes" />
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="@* | node()">
    <xsl:apply-templates select="@* | node()"/>
  </xsl:template>

  <xsl:template match="Main">
    <table>
      <tr>
        <th>Type</th>
        <th>Count</th>
      </tr>
      <xsl:apply-templates select="Records"/>
    </table>
  </xsl:template>

  <xsl:template match="Records">
    <xsl:apply-templates select="Record[generate-id() = generate-id(key('Record-by-Description', Description)[1])]" mode="group"/>
  </xsl:template>

  <xsl:template match="Record" mode="group">
    <tr>
      <td>
        <xsl:value-of select="Description"/>
      </td>
      <td>
        <xsl:value-of select="count(key('Record-by-Description', Description))"/>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

This works fine on one file and gives me the desired result: a single table that shows only the unique items, each with its count. However, I have been unable to produce the same result when going through index.xml to process multiple files.

I have tried using a separate template that targets index.xml and applies the 'Main' template to the different XML files, and I have also tried using a for-each to cycle through the different files.

Before being introduced to the Muenchian Method, I was using for-each with 'preceding' to check for duplicate nodes. However, 'preceding' only seems to search back through the current document, and I have been unable to find information on using it across multiple documents.

Is it possible, with either of these methods, to search through multiple documents for duplicated element text?

Thank you very much for your help.

Recommended answer

Keys are built per document, so Muenchian grouping based directly on a key will not let you identify and remove duplicates that are spread over more than one document.

You could, however, first merge the two documents into one and then apply the Muenchian grouping to the merged document.
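The two-pass variant of this idea can be sketched as follows. This is only an illustration, assuming the merge runs as its own transformation against index.xml and that its output is saved to a file before the original merge2.xsl is applied to it; the file name merge-step1.xsl is hypothetical and does not come from the question.

```xml
<!-- merge-step1.xsl (hypothetical name): run this against index.xml -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Wrap the Records elements of every file listed in index.xml
         in a single Main element, so they all live in one document -->
    <Main>
      <xsl:copy-of select="document(list/entry/@name)/Main/Records"/>
    </Main>
  </xsl:template>
</xsl:stylesheet>
```

Because the result of this pass is a single document, the key in the unchanged merge2.xsl then covers the Record elements from all input files.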

If you want to merge and group in a single stylesheet, you need to use exsl:node-set or a similar extension function:

  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:exsl="http://exslt.org/common" exclude-result-prefixes="exsl">

  <xsl:output method="xml" indent="yes" />
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="/">
    <xsl:variable name="merged-rtf">
      <Main>
        <xsl:copy-of select="document(list/entry/@name)/Main/Records"/>
      </Main>
    </xsl:variable>
    <xsl:apply-templates select="exsl:node-set($merged-rtf)/Main"/>
   </xsl:template>

  <xsl:template match="@* | node()">
    <xsl:apply-templates select="@* | node()"/>
  </xsl:template>

  <xsl:template match="Main">
    <table>
      <tr>
        <th>Type</th>
        <th>Count</th>
      </tr>
      <xsl:apply-templates select="Records"/>
    </table>
  </xsl:template>

  <xsl:template match="Records">
    <xsl:apply-templates select="Record[generate-id() = generate-id(key('Record-by-Description', Description)[1])]" mode="group"/>
  </xsl:template>

  <xsl:template match="Record" mode="group">
    <tr>
      <td>
        <xsl:value-of select="Description"/>
      </td>
      <td>
        <xsl:value-of select="count(key('Record-by-Description', Description))"/>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

You would now pass your index.xml as the main input document to the stylesheet.

If you want to run this transformation in the IE browser, you need to replace exsl:node-set with Microsoft's ms:node-set (with the proper namespace), or use the approach described in http://dpcarlisle.blogspot.de/2007/05/exslt-node-set-function.html to make sure the exsl:node-set function is implemented.
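For the first option, a minimal sketch of the parts that change is shown here, assuming the MSXML processor used by IE. The prefix msxsl is an arbitrary choice (the answer writes ms); what matters is that it is bound to the urn:schemas-microsoft-com:xslt namespace. The Main, Records and Record templates are assumed to carry over unchanged from the stylesheet above and are omitted.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="msxsl">

  <xsl:output method="xml" indent="yes"/>
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="/">
    <!-- Same merge step as before: build one Main from all listed files -->
    <xsl:variable name="merged-rtf">
      <Main>
        <xsl:copy-of select="document(list/entry/@name)/Main/Records"/>
      </Main>
    </xsl:variable>
    <!-- msxsl:node-set takes the place of exsl:node-set -->
    <xsl:apply-templates select="msxsl:node-set($merged-rtf)/Main"/>
  </xsl:template>

  <!-- The Main, Records and Record (mode="group") templates go here, unchanged -->

</xsl:stylesheet>
```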
