XSLT to Merge 2 XML Files
I know there are a few XML/XSLT merge questions here, but none of them seems to solve the problem I have.
What I am looking for is an XSLT stylesheet (as generic as possible - not tied to the structure of the input XML files) that can
merge a.xml with b.xml and generate c.xml in such a way that
- c.xml contains the nodes common to a.xml and b.xml (with the node values taken from a.xml)
- in addition, c.xml contains the nodes (and values) that are present in b.xml but not in a.xml
For example, merging a.xml:

    <root_node>
      <settings>
        <setting1>a1</setting1>
        <setting2>a2</setting2>
        <setting3>
          <setting31>a3</setting31>
        </setting3>
        <setting4>a4</setting4>
      </settings>
    </root_node>

with b.xml:

    <root_node>
      <settings>
        <setting1>b1</setting1>
        <setting2>b2</setting2>
        <setting3>
          <setting31>b3</setting31>
        </setting3>
        <setting5 id="77">b5</setting5>
      </settings>
    </root_node>

will generate c.xml:

    <root_node>
      <settings>
        <setting1>a1</setting1>
        <setting2>a2</setting2>
        <setting3>
          <setting31>a3</setting31>
        </setting3>
        <setting5 id="77">b5</setting5>
      </settings>
    </root_node>

Additional Information
I will try to explain what I mean by a "common node". This might not be an accurate XML/XSLT definition, since I am not an expert in either.
a/root_node/settings/setting1 is a "common node" with b/root_node/settings/setting1, since the 2 nodes are reached using the same path. The same goes for setting2 and setting3.
The 2 "non-common nodes" are a/root_node/settings/setting4, which is found only in a.xml (it should not appear in the output), and b/root_node/settings/setting5, which is found only in b.xml (it should appear in the output).
By "generic solution" I don't mean something that will work whatever format the input XML files have. What I mean is that the XSLT should not contain hard-coded XPaths, while you may add restrictions like "this will work only if the nodes in a.xml are unique" or whatever other restriction you think is suitable.
The following XSLT 1.0 program does what you want. Apply it to b.xml and pass in the path to a.xml as the aXmlPath parameter.
Here is how it works.
- It traverses B, as that contains the new nodes that you want to keep as well as the common elements between A and B.
  - I define "common element" as any element that has the same simple path in both documents.
  - I define "simple path" as the slash-delimited list of the names of the ancestor elements and of the element itself, i.e. the names along the ancestor-or-self axis. So in your sample B, <setting31> would have a simple path of root_node/settings/setting3/setting31/.
  - Note that this path is ambiguous: it carries no positional index, so two same-named siblings get the same simple path. The implication is that you cannot have any two elements with the same name that share the same parent in your input. Based on your samples I presume that will not be the case.
- For every leaf text node (any text node in an element with no child elements):
  - The simple path is calculated with a template called calculatePath.
  - The recursive template nodeValueByPath is called; it tries to retrieve the text value at the corresponding simple path from the other document.
  - If a corresponding text node is found, its value is used. This satisfies your first bullet point.
  - If no corresponding node is found, the value at hand is used, i.e. the value from B. This satisfies your second bullet point.
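To make the "simple path" definition and its ambiguity concrete, here is a minimal Python sketch. It is not part of the XSLT solution: the helper name simple_paths and the second <setting5 id="78"> sibling are assumptions added purely for illustration, and ElementTree's tag is used as a stand-in for XSLT's name(), which only matches for documents without namespaces.

```python
import xml.etree.ElementTree as ET


def simple_paths(root):
    """Yield (simple_path, element) for every element, in document order."""
    def walk(elem, prefix):
        # Same shape as calculatePath: ancestor names plus own name, each followed by "/".
        path = prefix + elem.tag + "/"
        yield path, elem
        for child in elem:
            yield from walk(child, path)
    yield from walk(root, "")


# Sample B from the question, plus one hypothetical duplicate sibling.
B_XML = """<root_node><settings>
  <setting1>b1</setting1>
  <setting3><setting31>b3</setting31></setting3>
  <setting5 id="77">b5</setting5>
  <setting5 id="78">b6</setting5>
</settings></root_node>"""

paths = [path for path, _ in simple_paths(ET.fromstring(B_XML))]
for path in paths:
    print(path)
```

The two <setting5> elements print the same path, root_node/settings/setting5/, which is exactly the case the restriction above rules out.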
As a result, the new document matches B's structure and contains:
- all text node values from B that have no corresponding node in A.
- text node values from A when a corresponding node in B exists.
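As a rough cross-check of the rule just described, the same merge can be restated outside XSLT. The Python sketch below uses only the standard library's xml.etree.ElementTree; lookup and merge are hypothetical helper names, and it approximates the stylesheet under the assumption that the inputs have no namespaces, comments or mixed content.

```python
import xml.etree.ElementTree as ET


def lookup(root, path):
    """Follow a simple path from the document element; return the element or None."""
    names = path.rstrip("/").split("/")
    if root.tag != names[0]:
        return None
    node = root
    for name in names[1:]:
        node = node.find(name)  # first child with that name, like [1] in the XSLT
        if node is None:
            return None
    return node


def merge(a_root, b_root):
    """Replace leaf text in b_root (in place) with A's text where the path exists in A."""
    def walk(elem, prefix):
        path = prefix + elem.tag + "/"
        if len(elem) == 0 and elem.text:  # leaf element that has a text node
            match = lookup(a_root, path)
            if match is not None and len(match) == 0:
                elem.text = match.text or ""  # "exists in A, but is empty" also wins
        for child in elem:
            walk(child, path)
    walk(b_root, "")
    return b_root


A_XML = """<root_node><settings>
  <setting1>a1</setting1><setting2>a2</setting2>
  <setting3><setting31>a3</setting31></setting3>
  <setting4>a4</setting4>
</settings></root_node>"""

B_XML = """<root_node><settings>
  <setting1>b1</setting1><setting2>b2</setting2>
  <setting3><setting31>b3</setting31></setting3>
  <setting5 id="77">b5</setting5>
</settings></root_node>"""

merged = merge(ET.fromstring(A_XML), ET.fromstring(B_XML))
print(ET.tostring(merged, encoding="unicode"))
```

Because the walk is driven by B, <setting4> never appears in the result, and attributes such as id="77" are carried over untouched.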
Here's the XSLT:
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes" />

      <!-- path to a.xml, passed in by the caller -->
      <xsl:param name="aXmlPath" select="''" />
      <xsl:param name="aDoc" select="document($aXmlPath)" />

      <!-- identity template: copies B unchanged unless overridden below -->
      <xsl:template match="@* | node()">
        <xsl:copy>
          <xsl:apply-templates select="@* | node()" />
        </xsl:copy>
      </xsl:template>

      <!-- text nodes of leaf elements will be checked against doc A -->
      <xsl:template match="*[not(*)]/text()">
        <xsl:variable name="path">
          <xsl:call-template name="calculatePath" />
        </xsl:variable>
        <xsl:variable name="valueFromA">
          <xsl:call-template name="nodeValueByPath">
            <xsl:with-param name="path" select="$path" />
            <xsl:with-param name="context" select="$aDoc" />
          </xsl:call-template>
        </xsl:variable>
        <xsl:choose>
          <!-- either there is something at that path in doc A -->
          <xsl:when test="starts-with($valueFromA, 'found:')">
            <!-- remove prefix added in nodeValueByPath, see there -->
            <xsl:value-of select="substring-after($valueFromA, 'found:')" />
          </xsl:when>
          <!-- or we take the value from doc B -->
          <xsl:otherwise>
            <xsl:value-of select="." />
          </xsl:otherwise>
        </xsl:choose>
      </xsl:template>

      <!-- this calculates a simple path for a node -->
      <xsl:template name="calculatePath">
        <!-- recurse to the parent first so that names come out root-first -->
        <xsl:for-each select="..">
          <xsl:call-template name="calculatePath" />
        </xsl:for-each>
        <xsl:if test="self::*">
          <xsl:value-of select="concat(name(), '/')" />
        </xsl:if>
      </xsl:template>

      <!-- this retrieves a node value by its simple path -->
      <xsl:template name="nodeValueByPath">
        <xsl:param name="path" select="''" />
        <!-- default is an empty node-set, so count($context) is always valid -->
        <xsl:param name="context" select="/.." />
        <xsl:if test="contains($path, '/') and count($context)">
          <xsl:variable name="elemName" select="substring-before($path, '/')" />
          <xsl:variable name="nextPath" select="substring-after($path, '/')" />
          <xsl:variable name="currContext" select="$context/*[name() = $elemName][1]" />
          <xsl:if test="$currContext">
            <xsl:choose>
              <!-- more path steps left: descend one level -->
              <xsl:when test="contains($nextPath, '/')">
                <xsl:call-template name="nodeValueByPath">
                  <xsl:with-param name="path" select="$nextPath" />
                  <xsl:with-param name="context" select="$currContext" />
                </xsl:call-template>
              </xsl:when>
              <!-- last step: only a leaf element in A counts as a match -->
              <xsl:when test="not($currContext/*)">
                <!-- always add a prefix so we can detect
                     the case "exists in A, but is empty" -->
                <xsl:value-of select="concat('found:', $currContext/text())" />
              </xsl:when>
            </xsl:choose>
          </xsl:if>
        </xsl:if>
      </xsl:template>
    </xsl:stylesheet>