Using linq to merge multiple XML files with the same structure and removing duplicates based on a key
Problem description
I have multiple XML files I'm trying to merge into a single file. Linq to XML is probably the best option but I'm open to ideas (XSLT seems good at merging TWO files but is clumsy where n > 2 or n = big).
From reading other questions here, some sort of join looks good.
File1.xml:
<first>
  <second>
    <third id="Id1">
      <values>
        <value a="1" b="one"/>
        <value a="2" b="two"/>
        <value a="3" b="three"/>
      </values>
    </third>
    <third id="Id2">
      <values>
        <value a="f" b="foo"/>
        <value a="b" b="bar"/>
        <value a="w" b="wibble"/>
      </values>
    </third>
  </second>
</first>
File2.xml:
<first>
  <second>
    <third id="Id1">
      <values>
        <value a="2" b="two"/>
        <value a="3" b="three"/>
        <value a="6" b="six"/>
      </values>
    </third>
    <third id="Id3">
      <values>
        <value a="x" b="ex"/>
        <value a="y" b="why"/>
        <value a="z" b="zed"/>
      </values>
    </third>
  </second>
</first>
Merged.xml:
<first>
  <second>
    <third id="Id1">
      <values>
        <value a="1" b="one"/>
        <value a="2" b="two"/>
        <value a="3" b="three"/>
        <value a="6" b="six"/>
      </values>
    </third>
    <third id="Id2">
      <values>
        <value a="f" b="foo"/>
        <value a="b" b="bar"/>
        <value a="w" b="wibble"/>
      </values>
    </third>
    <third id="Id3">
      <values>
        <value a="x" b="ex"/>
        <value a="y" b="why"/>
        <value a="z" b="zed"/>
      </values>
    </third>
  </second>
</first>
i.e. it merges the values based on the third/@id attribute.
How do I do this elegantly with linq?
Recommended answer
The below is still quite ugly, and I am sure it could be brought into a somewhat more streamlined shape with a bit of work, but for now this seems to do the job:
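using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Place this method inside a class of your choice.
public static void MergeXml()
{
    var xdoc1 = XDocument.Load(@"c:\temp\test.xml");
    var xdoc2 = XDocument.Load(@"c:\temp\test2.xml");
    // Snapshot both sequences so the trees can be modified while looping.
    var d1Targets = xdoc1.Descendants("third").ToList();
    var d2Selection = xdoc2.Descendants("third").ToList();

    // True when x and y carry the same value for attribute a.
    Func<XElement, XElement, string, bool> attributeMatches = (x, y, a) =>
        x.Attribute(a).Value == y.Attribute(a).Value;
    // True when ys already contains an element equal to x.
    Func<IEnumerable<XElement>, XElement, bool> hasMatchingValue = (ys, x) =>
        // Drop the "b" check if a matching "a" alone should count as a duplicate.
        ys.Any(d => attributeMatches(d, x, "a") && attributeMatches(d, x, "b"));

    foreach (var e in d1Targets)
    {
        var fromD2 = d2Selection.Find(x => attributeMatches(x, e, "id"));
        if (fromD2 != null)
        {
            d2Selection.Remove(fromD2);
            var dest = e.Descendants("value");
            // Append the values from file 2 that file 1 does not already have.
            dest.LastOrDefault()
                .AddAfterSelf(fromD2.Descendants("value").Where(x => !hasMatchingValue(dest, x)));
        }
    }

    // <third> elements that exist only in file 2 go after the last one from file 1.
    if (d2Selection.Count > 0)
        d1Targets.LastOrDefault().AddAfterSelf(d2Selection);
    xdoc1.Save(@"c:\temp\merged.xml");
}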
This produces the following output file from the two example input files in the OP's question:
<?xml version="1.0" encoding="utf-8"?>
<first>
  <second>
    <third id="Id1">
      <values>
        <value a="1" b="one" />
        <value a="2" b="two" />
        <value a="3" b="three" />
        <value a="6" b="six" />
      </values>
    </third>
    <third id="Id2">
      <values>
        <value a="f" b="foo" />
        <value a="b" b="bar" />
        <value a="w" b="wibble" />
      </values>
    </third>
    <third id="Id3">
      <values>
        <value a="x" b="ex" />
        <value a="y" b="why" />
        <value a="z" b="zed" />
      </values>
    </third>
  </second>
</first>
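The question also asks about n > 2 files, which the two-file method above does not cover directly. One possible way to generalize it, offered only as a sketch and not as the answer author's approach, is to build a new tree with GroupBy instead of patching the first document in place. The names XmlMerger and Merge are hypothetical, and the sketch assumes every third element has an id attribute and that two value elements are duplicates when both a and b match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlMerger
{
    // Hypothetical helper (not from the original answer): merges any number of
    // documents shaped like first/second/third[@id]/values/value[@a][@b].
    // Assumes every <third> has an id attribute; a missing id would throw.
    public static XDocument Merge(IEnumerable<XDocument> docs)
    {
        var merged = docs
            .SelectMany(d => d.Descendants("third"))
            // Group <third> elements by id; groups come out in first-seen order.
            .GroupBy(t => (string)t.Attribute("id"))
            .Select(g => new XElement("third",
                new XAttribute("id", g.Key),
                new XElement("values",
                    g.SelectMany(t => t.Descendants("value"))
                     // Two values are duplicates when both a and b match.
                     .GroupBy(v => ((string)v.Attribute("a"), (string)v.Attribute("b")))
                     // Copy the first element of each group into the new tree.
                     .Select(vg => new XElement(vg.First())))));

        return new XDocument(new XElement("first", new XElement("second", merged)));
    }
}
```

With a list of file names in a variable such as paths (also hypothetical), this could be called as XmlMerger.Merge(paths.Select(p => XDocument.Load(p))).Save(@"c:\temp\merged.xml"). Because the result is a fresh document, none of the input documents is modified. To treat values as duplicates on a alone, group by the a attribute only.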