XSLT to Merge 2 XML Files

Question

I know there are a few XML/XSLT merge-related questions here; however, none of them seems to solve the problem I have.

What I am looking for is an XSLT (as generic as possible, i.e. not tied to the structure of the input XML files) which can

merge a.xml with b.xml and generate c.xml in such a way that

  • c.xml will contain the common nodes between a.xml and b.xml (with the node values taken from a.xml)
  • in addition, c.xml will contain the nodes (and values) which are present in b.xml and not in a.xml

For example: merging a.xml:

<root_node>
  <settings>
    <setting1>a1</setting1>
    <setting2>a2</setting2>
    <setting3>
      <setting31>a3</setting31>
    </setting3>
    <setting4>a4</setting4>
  </settings>
</root_node>

with b.xml:

<root_node>
  <settings>
    <setting1>b1</setting1>
    <setting2>b2</setting2>
    <setting3>
      <setting31>b3</setting31>
    </setting3>
    <setting5 id="77">b5</setting5>
  </settings>
</root_node>

will generate c.xml:

<root_node>
  <settings>
    <setting1>a1</setting1>
    <setting2>a2</setting2>
    <setting3>
      <setting31>a3</setting31>
    </setting3>
    <setting5 id="77">b5</setting5>
  </settings>
</root_node>

Additional Information

I will try to explain what I understand by a "common node". This might not be an accurate XML/XSLT definition, since I am not an expert in either.

a/root_node/settings/setting1 is a "common node" with b/root_node/settings/setting1, since the 2 nodes are reached using the same path. The same goes for setting2 and setting3.

The 2 "non-common nodes" are a/root_node/settings/setting4, which is found only in a.xml (it should not appear in the output), and b/root_node/settings/setting5, which is found only in b.xml (it should appear in the output).

By "generic solution" I don't mean something that will work whatever format the input XMLs have. What I mean is that the XSLT should not contain hard-coded XPaths, while you might add restrictions like "this will work only if the nodes in a.xml are unique" or whatever other restriction you think is suitable.

Solution

The following XSLT 1.0 program does what you want.

Apply it to b.xml and pass in the path to a.xml as a parameter.

Here is how it works.

  1. It traverses B, as that contains the new nodes that you want to keep as well as the common elements between A and B.

    1. I define "common element" as any element that has the same simple path.
    2. I define "simple path" as the slash-delimited list of names of ancestor elements and the element itself, i.e. the ancestor-or-self axis.
      So in your sample B, <setting31> would have a simple path of root_node/settings/setting3/setting31/.
    3. Note that this path is ambiguous: it cannot tell apart siblings that have the same name. The implication is that you cannot have any two elements with the same name that share the same parent in your input. Based on your samples I presume that will not be the case.

  2. For every leaf text node (any text node in an element with no further child elements)

    1. The simple path is calculated with a template called calculatePath.
    2. The recursive template nodeValueByPath is then called; it tries to retrieve the text value at the corresponding simple path in the other document (A).
    3. If a corresponding text node is found, its value is used. This satisfies your first bullet point.
    4. If no corresponding node is found, it uses the value at hand, i.e. the value from B. This satisfies your second bullet point.
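
To make the two steps above concrete, here is a minimal sketch of the same logic in Python rather than XSLT. It is not part of the answer's stylesheet: the function names simple_path and value_by_path are hypothetical stand-ins for calculatePath and nodeValueByPath, and the two tiny documents are shortened versions of the samples. It assumes, like the stylesheet, that sibling names are unique (the first match wins).

```python
import xml.etree.ElementTree as ET


def simple_path(elem, parents):
    """Rough analogue of calculatePath: element names from the root
    down to elem, each followed by a slash."""
    names = []
    node = elem
    while node is not None:
        names.append(node.tag)
        node = parents.get(node)  # None once we step above the root
    return "".join(name + "/" for name in reversed(names))


def value_by_path(path, root):
    """Rough analogue of nodeValueByPath: walk `path` inside the other
    document and return (found, text). `found` plays the role of the
    'found:' prefix in the stylesheet."""
    names = path.split("/")[:-1]  # drop the empty piece after the last '/'
    if not names or root.tag != names[0]:
        return (False, None)
    current = root
    for name in names[1:]:
        # first child with that name, like *[name() = $elemName][1]
        current = next((c for c in current if c.tag == name), None)
        if current is None:
            return (False, None)
    if len(current) > 0:  # has child elements, so it is not a leaf
        return (False, None)
    return (True, current.text or "")


# Shortened sample documents (assumption: trimmed from the question's samples).
a_root = ET.fromstring(
    "<root_node><settings><setting1>a1</setting1>"
    "<setting4>a4</setting4></settings></root_node>"
)
b_root = ET.fromstring(
    '<root_node><settings><setting1>b1</setting1>'
    '<setting5 id="77">b5</setting5></settings></root_node>'
)

# ElementTree has no parent pointers, so build a child -> parent map for B.
parents = {child: parent for parent in b_root.iter() for child in parent}

for leaf in (e for e in b_root.iter() if len(e) == 0):
    path = simple_path(leaf, parents)
    found, text = value_by_path(path, a_root)
    print(path, "->", text if found else leaf.text)
```

Returning a (found, text) pair is the Python counterpart of the 'found:' prefix: it keeps "exists in A but is empty" distinct from "does not exist in A".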

As a result, the new document matches B's structure and contains:

  • all text node values from B that have no corresponding node in A.
  • text node values from A when a corresponding node in B exists.
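
As a cross-check of that result, the following Python sketch applies the same merge rule to the two sample documents. It is an illustration under stated assumptions, not the answer's method: instead of building path strings it walks B and A in parallel (which selects the same first-match nodes), and the name merge is hypothetical.

```python
import xml.etree.ElementTree as ET

A_XML = """<root_node>
  <settings>
    <setting1>a1</setting1>
    <setting2>a2</setting2>
    <setting3>
      <setting31>a3</setting31>
    </setting3>
    <setting4>a4</setting4>
  </settings>
</root_node>"""

B_XML = """<root_node>
  <settings>
    <setting1>b1</setting1>
    <setting2>b2</setting2>
    <setting3>
      <setting31>b3</setting31>
    </setting3>
    <setting5 id="77">b5</setting5>
  </settings>
</root_node>"""


def merge(b_elem, a_elem):
    """Copy b_elem; for leaf elements that carry text, take the text from
    a_elem when A has a leaf at the same position in the tree."""
    result = ET.Element(b_elem.tag, dict(b_elem.attrib))
    result.text, result.tail = b_elem.text, b_elem.tail
    if len(b_elem) == 0:
        # Like the stylesheet, only leaves of B that actually contain a
        # text node are checked against A.
        if b_elem.text is not None and a_elem is not None and len(a_elem) == 0:
            result.text = a_elem.text
        return result
    for b_child in b_elem:
        a_child = None
        if a_elem is not None:
            # first child of A with the same name, if any
            a_child = next((c for c in a_elem if c.tag == b_child.tag), None)
        result.append(merge(b_child, a_child))
    return result


a_root = ET.fromstring(A_XML)
b_root = ET.fromstring(B_XML)
c_root = merge(b_root, a_root if a_root.tag == b_root.tag else None)
print(ET.tostring(c_root, encoding="unicode"))
```

Because the walk is driven by B, setting4 (present only in A) never reaches the output, while setting5 keeps both its id attribute and its value from B.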

Here's the XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" />

  <xsl:param name="aXmlPath" select="''" />
  <xsl:param name="aDoc"     select="document($aXmlPath)" />

  <xsl:template match="@* | node()">
    <xsl:copy>
       <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>

  <!-- text nodes will be checked against doc A -->
  <xsl:template match="*[not(*)]/text()">
    <xsl:variable name="path">
      <xsl:call-template name="calculatePath" />
    </xsl:variable>

    <xsl:variable name="valueFromA">
      <xsl:call-template name="nodeValueByPath">
        <xsl:with-param name="path"    select="$path" />
        <xsl:with-param name="context" select="$aDoc" />
      </xsl:call-template>
    </xsl:variable>

    <xsl:choose>
      <!-- either there is something at that path in doc A -->
      <xsl:when test="starts-with($valueFromA, 'found:')">
        <!-- remove prefix added in nodeValueByPath, see there --> 
        <xsl:value-of select="substring-after($valueFromA, 'found:')" />
      </xsl:when>
      <!-- or we take the value from doc B -->
      <xsl:otherwise>
        <xsl:value-of select="." />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- this calculates a simple path for a node -->
  <xsl:template name="calculatePath">
    <xsl:for-each select="..">
      <xsl:call-template name="calculatePath" />
    </xsl:for-each>
    <xsl:if test="self::*">
      <xsl:value-of select="concat(name(), '/')" />
    </xsl:if>
  </xsl:template>

  <!-- this retrieves a node value by its simple path -->
  <xsl:template name="nodeValueByPath">
    <xsl:param name="path"    select="''" />
    <xsl:param name="context" select="/.." />

    <xsl:if test="contains($path, '/') and count($context)">
      <xsl:variable name="elemName" select="substring-before($path, '/')" />
      <xsl:variable name="nextPath" select="substring-after($path, '/')" />
      <xsl:variable name="currContext" select="$context/*[name() = $elemName][1]" />

      <xsl:if test="$currContext">
        <xsl:choose>
          <xsl:when test="contains($nextPath, '/')">
            <xsl:call-template name="nodeValueByPath">
              <xsl:with-param name="path"    select="$nextPath" />
              <xsl:with-param name="context" select="$currContext" />
            </xsl:call-template>
          </xsl:when>
          <xsl:when test="not($currContext/*)">
            <!-- always add a prefix so we can detect 
                 the case "exists in A, but is empty" -->
            <xsl:value-of select="concat('found:', $currContext/text())" />
          </xsl:when>
        </xsl:choose>
      </xsl:if>
    </xsl:if>    
  </xsl:template>
</xsl:stylesheet>
