How to sort a big XML list with no unique field/element to sort by

Problem description

I have a really big XML file to sort. I am trying to validate that there are no differences between two XML lists, but all my "diff" applications show a lot of differences, even though I know 98% of the information is in both lists.

I have tried various ways of sorting the XML by one or more elements so that both files are ordered the same way, but with no luck, because neither XML has a unique value for each "row", so to speak. There is an Email field, but sometimes it is missing completely, which makes it a poor field to sort by.

It looks like this:

<Customer>
  <row CompanyID="1" Name="John" Email="John@mail.com" />
  <row CompanyID="1" Name="Jane" Email="Jane@mail.com" />
  <row CompanyID="1" Name="Howard" Email="Howard@mail.com" />
  <row CompanyID="2" Name="Jen" Email="Jen@mail.com" />
  <row CompanyID="2" Name="James" Email="James@mail.com" />
  <row CompanyID="3" Name="Phil" Email="Phil@mail.com" />
  <row CompanyID="3" Name="Kenny" />
  <row CompanyID="3" Name="Andrew" Email="Andrew@mail.com" />
  <row CompanyID="3" Name="Greg" Email="Greg@mail.com" />
  <row CompanyID="4" Name="Julia" Email="Julia@mail.com" />
  <row CompanyID="4" Name="Hannah" Email="Hannah@mail.com" />
  <row CompanyID="4" Name="Riley" Email="" />
  <row CompanyID="4" Name="Anders" Email="Anders@mail.com" />
</Customer>

(The XML is for illustration only.)

Is there a good way to solve this problem?

What I need is either a good way of sorting both of them, or a comparison application that can compare XML files without taking element order into account.

Recommended answer

Use Microsoft XML Diff: https://msdn.microsoft.com/en-us/library/aa302294.aspx

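using System.Xml;
using Microsoft.XmlDiffPatch;

public void GenerateDiffGram(string originalFile, string finalFile, XmlWriter diffGramWriter)
{
   // IgnoreChildOrder makes the comparison insensitive to the order of child elements.
   XmlDiff xmldiff = new XmlDiff(XmlDiffOptions.IgnoreChildOrder |
       XmlDiffOptions.IgnoreNamespaces | XmlDiffOptions.IgnorePrefixes);

   // True when the two files are identical under the options above.
   bool bIdentical = xmldiff.Compare(originalFile, finalFile, false, diffGramWriter);
   diffGramWriter.Close();
}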

The Compare() method returns true if the two files are identical, and false otherwise. The last argument, diffGramWriter, is where the output of the comparison is written. The output is an XML document (a diffgram) that records the differences between the two files. A fuller version that writes the diffgram to a file looks like this:

// Requires: using System.IO; using System.Xml; using Microsoft.XmlDiffPatch;
public void CompareXml(string file1, string file2, string diffFileNameWithPath)
{
    // file1 and file2 are paths to the XML files to compare;
    // the diffgram is written to diffFileNameWithPath.
    using (FileStream fs = new FileStream(diffFileNameWithPath, FileMode.Create))
    using (XmlWriter diffGramWriter = XmlWriter.Create(fs))
    {
        XmlDiff xmldiff = new XmlDiff(XmlDiffOptions.IgnoreChildOrder |
                                XmlDiffOptions.IgnoreNamespaces |
                                XmlDiffOptions.IgnorePrefixes);

        // True when the files are identical, ignoring child order, namespaces and prefixes.
        bool bIdentical = xmldiff.Compare(file1, file2, false, diffGramWriter);
    }
}
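
If you would rather sort both files and keep using an ordinary diff tool, as the question originally asked, one possible approach is to sort the rows by a composite key built from all of their attributes. No unique field is needed: two rows that share the same key are identical, so their relative order cannot produce a difference. The following is only a sketch using LINQ to XML. It is not part of the XML Diff answer above; the names XmlRowSorter, RowKey and SortRows are made up for illustration, and it assumes the rows carry all of their data in attributes (as in the sample) and that the file fits in memory.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlRowSorter
{
    // Hypothetical helper: builds a sort key from every attribute of a row,
    // ordered by attribute name, so a row without Email just gets a shorter key.
    private static string RowKey(XElement row) =>
        string.Join("\u0001", row.Attributes()
            .OrderBy(a => a.Name.ToString(), StringComparer.Ordinal)
            .Select(a => $"{a.Name}={a.Value}"));

    // Returns a new document whose root children are sorted by RowKey.
    // The input document is left unchanged (LINQ to XML clones attached nodes).
    public static XDocument SortRows(XDocument doc)
    {
        XElement root = doc.Root;
        var sortedRoot = new XElement(root.Name, root.Attributes(),
            root.Elements().OrderBy(r => RowKey(r), StringComparer.Ordinal));
        return new XDocument(sortedRoot);
    }

    public static void Main()
    {
        // Same rows in a different order; one row has no Email attribute.
        var first = XDocument.Parse(
            "<Customer><row CompanyID=\"3\" Name=\"Kenny\" />" +
            "<row CompanyID=\"1\" Name=\"John\" Email=\"John@mail.com\" /></Customer>");
        var second = XDocument.Parse(
            "<Customer><row CompanyID=\"1\" Name=\"John\" Email=\"John@mail.com\" />" +
            "<row CompanyID=\"3\" Name=\"Kenny\" /></Customer>");

        // Prints whether the two documents match once both are sorted.
        Console.WriteLine(XNode.DeepEquals(SortRows(first), SortRows(second)));
    }
}
```

For real files, you could load each one with XDocument.Load, call SortRows, write the result with Save, and then run your usual diff tool on the two sorted files.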
