C# XML Diffing Algorithm


Question

I have two XML documents: one from before the user edited it and one from after. I need to check that the user has only added new elements and has not deleted or changed existing ones.

Can anybody suggest a good algorithm for that comparison?

PS: My XML has a very simple schema; it only represents an object's structure (with nested objects) in a naive way. Only a few tags are allowed: an <object> tag can contain only <name>, <type>, or <list> tags. The <name> and <type> tags can contain only a string; a <list> tag can contain a <name> tag and a single <object> tag (representing the structure of the objects in the list). The string in the <name> tag can be freely chosen; the string in the <type> tag can only be "string", "int", "float", "bool", "date", or "composite".

Here is an example:

 <object>
      <name>Person</name>
      <type>composite</type>

      <object>
            <name>Person_Name</name>
            <type>string</type>
      </object>

      <object>
            <name>Person_Surname</name>
            <type>string</type>
      </object>

      <object>
            <name>Person_Age</name>
            <type>int</type>
      </object>

      <object>
            <name>Person_Weight</name>
            <type>float</type>
      </object>

      <object>
            <name>Person_Address</name>
            <type>string</type>
      </object>

      <object>
            <name>Person_BirthDate</name>
            <type>date</type>
      </object>

      <list>
            <name>Person_PhoneNumbers</name>

            <object>
                  <name>Person_PhoneNumber</name>
                  <type>composite</type>

                  <object>
                        <name>Person_PhoneNumber_ProfileName</name>
                        <type>string</type>
                  </object>
                  <object>
                        <name>Person_PhoneNumber_CellNumber</name>
                        <type>string</type>
                  </object>
                  <object>
                        <name>Person_PhoneNumber_HomeNumber</name>
                        <type>string</type>
                  </object>
                  <object>
                        <name>Person_PhoneNumber_FaxNumber</name>
                        <type>string</type>
                  </object>
                  <object>
                        <name>Person_PhoneNumber_Mail</name>
                        <type>string</type>
                  </object>
                  <object>
                        <name>Person_PhoneNumber_Social</name>
                        <type>string</type>
                  </object>
                  <object>
                        <name>Person_PhoneNumber_IsActive</name>
                        <type>bool</type>
                  </object>
            </object>
      </list>
 </object>


Recommended answer

You said:

I need to check that user have only added new elements 
but have not deleted or changed old ones.

Can you be more precise about what you mean?

For example, if I insert a new "object" element somewhere, I've changed every element that contains it, right? That means every list and every other object that encloses it. In fact, any insertion at all is a change to the root element.

So presumably you do not want to count a change that alters nothing except the root element. What about adding a new item to the list you show? Should the list count as changed? And what if the objects in the list, or the list itself, are moved to new places without any change to their content?

Each of those possibilities is pretty easy to implement, but you have to decide what counts as a change first.
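To make the dependence on the definition concrete, here is a minimal sketch of one possible definition, not the only reasonable one. It is written in Python for brevity. The function names `keyed_entries` and `compare` are hypothetical, and the key choice is an assumption: each <object> or <list> is identified by the chain of <name> values from the root down to it, plus its <type> (a <list> has no <type> in the schema above, so the literal "list" is recorded). Under this definition, adding a child does not change its ancestors, reordering siblings is ignored, and moving an element to a different parent counts as one removal plus one addition.

```python
import xml.etree.ElementTree as ET


def keyed_entries(xml_text):
    """Collect a (path, kind) entry for every <object> and <list>.

    path is the tuple of <name> texts from the root down to the element.
    kind is the <type> text, or "list" for a <list> element (assumption).
    """
    entries = set()

    def walk(elem, parents):
        path = parents + (elem.findtext("name", default=""),)
        if elem.tag == "list":
            kind = "list"
        else:
            kind = elem.findtext("type", default="")
        entries.add((path, kind))
        # Recurse only into structural children; <name>/<type> are leaves.
        for child in elem:
            if child.tag in ("object", "list"):
                walk(child, path)

    walk(ET.fromstring(xml_text), ())
    return entries


def compare(before_xml, after_xml):
    """Return (added, removed) entries under the path-based definition."""
    before = keyed_entries(before_xml)
    after = keyed_entries(after_xml)
    return sorted(after - before), sorted(before - after)
```

The edit is acceptable under this definition exactly when the `removed` list is empty. Renaming a parent changes the path of everything beneath it, so all of its descendants show up as removed and re-added; whether that is desirable is part of the decision described above.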

If, for example, you care only about bottom-level objects, and "the same" means exactly the same text content (setting aside attributes, white-space variations, and so on), then the easiest way is to load the "before" file into a list of (name, type) pairs, and load the "after" file into a similar but separate list. Sort both lists, then walk down them simultaneously and report anything in the new list that is not in the old one. You will probably want to report any deletions too, just in case.
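The sort-and-walk comparison described above could be sketched as follows. This is an illustration in Python, not a definitive implementation. The function names are hypothetical, and "bottom-level" is assumed to mean an <object> with no nested <object> or <list> child. Text is compared exactly as it appears, with no white-space normalization.

```python
import xml.etree.ElementTree as ET


def leaf_pairs(xml_text):
    """Return a sorted list of (name, type) pairs for bottom-level objects.

    Assumption: an <object> is bottom-level when it has no direct
    <object> or <list> child.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for obj in root.iter("object"):
        if obj.find("object") is None and obj.find("list") is None:
            name = obj.findtext("name", default="")
            type_ = obj.findtext("type", default="")
            pairs.append((name, type_))
    pairs.sort()
    return pairs


def diff_sorted(before, after):
    """Walk two sorted lists in step and return (added, deleted)."""
    added, deleted = [], []
    i = j = 0
    while i < len(before) and j < len(after):
        if before[i] == after[j]:
            i += 1
            j += 1
        elif before[i] < after[j]:
            # Present in the old list only: it was deleted or changed.
            deleted.append(before[i])
            i += 1
        else:
            # Present in the new list only: it was added.
            added.append(after[j])
            j += 1
    deleted.extend(before[i:])
    added.extend(after[j:])
    return added, deleted


def only_additions(before_xml, after_xml):
    """Return (ok, added, deleted); ok is True when nothing was deleted."""
    added, deleted = diff_sorted(leaf_pairs(before_xml), leaf_pairs(after_xml))
    return not deleted, added, deleted
```

A changed <type> or <name> shows up as one deletion plus one addition, so checking that `deleted` is empty covers both "deleted" and "changed". Because only (name, type) pairs are compared, moving a leaf to a different parent goes unnoticed in this sketch.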
