Quickest way to compare two generic lists for differences

Question

What is the quickest (and least resource-intensive) way to compare two massive lists (more than 50,000 items each) and end up with two result lists like the ones below:

  1. Items that appear in the first list but not in the second
  2. Items that appear in the second list but not in the first

Currently I'm working with List or IReadOnlyCollection and solving this with a LINQ query:

var onlyInFirst = list1.Where(i => !list2.Contains(i)).ToList();
var onlyInSecond = list2.Where(i => !list1.Contains(i)).ToList();

But this doesn't perform as well as I would like. Any ideas for making it quicker and less resource-intensive, given that I need to process a lot of lists?

Recommended answer

Use Except:

var firstNotSecond = list1.Except(list2).ToList();
var secondNotFirst = list2.Except(list1).ToList();

I suspect there are approaches which would actually be marginally faster than this, but even this will be vastly faster than your O(N * M) approach. Except builds a hash set from the second sequence, so each call is roughly O(N + M) instead of scanning one list for every item of the other.
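To make the complexity difference concrete, here is a minimal sketch of roughly how a hash-based Except behaves. It is an illustration only, not the library's actual source; the names ExceptSketchDemo and ExceptSketch are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptSketchDemo
{
    // Hypothetical stand-in for Enumerable.Except, for illustration only.
    public static IEnumerable<T> ExceptSketch<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // One pass over 'second' builds the hash set: O(M) on average.
        var seen = new HashSet<T>(second);

        // One pass over 'first': each Add is O(1) on average, so O(N) overall.
        foreach (var item in first)
        {
            // Add returns false if the item is already in the set, which
            // filters out items from 'second' and repeated items from 'first'.
            if (seen.Add(item))
            {
                yield return item;
            }
        }
    }
}
```

By contrast, calling Contains on a List inside Where scans the list for every item, which is where the O(N * M) cost in the question comes from.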

If you want to combine these into a single check, you could create a method containing the code above, followed by a return statement:

return !firstNotSecond.Any() && !secondNotFirst.Any();
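Put together, such a method might look like the sketch below. The class name ListComparer and the method name HaveSameDistinctItems are assumptions made for illustration; the body is just the two Except calls and the return statement from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListComparer
{
    // Hypothetical helper: true when neither list contains an item the other lacks.
    // Because Except works on distinct values, order and duplicate counts are ignored.
    public static bool HaveSameDistinctItems<T>(
        IReadOnlyCollection<T> list1, IReadOnlyCollection<T> list2)
    {
        var firstNotSecond = list1.Except(list2).ToList();
        var secondNotFirst = list2.Except(list1).ToList();
        return !firstNotSecond.Any() && !secondNotFirst.Any();
    }
}
```

With this sketch, a call such as ListComparer.HaveSameDistinctItems(new List<int> { 1, 2, 2 }, new List<int> { 2, 1 }) reports the lists as equal, since only the set of distinct values is compared.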

One point to note is that the results differ between the original code in the question and the solution here: duplicate elements that appear in only one list are reported just once by my code, whereas the original code reports them as many times as they occur.

For example, with lists of [1, 2, 2, 2, 3] and [1], the "elements in list1 but not list2" result in the original code would be [2, 2, 2, 3]. With my code it would just be [2, 3]. In many cases that won't be an issue, but it's worth being aware of.
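If the duplicates do matter, one possible variant (not part of the answer above) keeps the question's Where/Contains shape but runs Contains against a HashSet instead of a list. The names DuplicatePreservingDiff and ExceptKeepingDuplicates are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicatePreservingDiff
{
    // Hypothetical helper: items of 'first' that are absent from 'second',
    // keeping every occurrence and the original order of 'first'.
    public static List<T> ExceptKeepingDuplicates<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // HashSet lookups are O(1) on average, unlike List.Contains, which is O(M).
        var lookup = new HashSet<T>(second);
        return first.Where(item => !lookup.Contains(item)).ToList();
    }
}
```

For the lists [1, 2, 2, 2, 3] and [1] this yields [2, 2, 2, 3], matching the original code's output while avoiding the repeated list scans.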
