为什么 LINQ 操作可以比普通循环更快? [英] Why can LINQ operations be faster than a normal loop?

查看:35
本文介绍了为什么 LINQ 操作可以比普通循环更快?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在今天的编程讨论中,我和一个朋友有点困惑.举个例子,我们创建了一个虚构的问题,它有一个包含 n 个随机整数(通常为 1.000.000)的 List,并且想要创建一个函数来返回所有整数的集合比其中之一.很简单的东西.我们创建了一个 LINQ 语句来解决这个问题,以及一个基于普通插入排序的算法.

A friend and I were a bit perplexed during a programming discussion today. As an example, we created a fictive problem of having a List<int> of n random integers (typically 1.000.000) and wanted to create a function that returned the set of all integers that there were more than one of. Pretty straightforward stuff. We created one LINQ statement to solve this problem, and a plain insertion sort based algorithm.

现在,当我们测试代码运行的速度时(使用System.Diagnostics.StopWatch),结果令人困惑.LINQ 代码不仅优于简单排序,而且运行速度比单个 foreach/for 只执行一个循环列表中没有任何操作(从侧面来说,我认为编译器应该发现并删除所有内容).

Now, as we tested the speed the code ran at (using System.Diagnostics.StopWatch), the results were confusing. Not only did the LINQ code outperform the simple sort, but it ran faster than a single foreach/for that only did a single loop of the list, and that had no operations within (which, on a side track, I thought the compiler was supposed to discover and remove alltogether).

如果我们在程序的同一执行中生成一个新的随机数 List 并再次运行 LINQ 代码,则性能将提高几个数量级(通常是千倍).空循环的表现当然是一样的.

If we generated a new List<int> of random numbers in the same execution of the program and ran the LINQ code again, the performance would increase by orders of magnitude (typically thousandfold). The performance of the empty loops were of course the same.

那么,这里发生了什么?LINQ 是否使用并行性来胜过普通循环?这些结果怎么可能呢?LINQ 使用以 n*log(n) 运行的快速排序,每个定义已经比 n 慢.

So, what is going on here? Is LINQ using parallelism to outperform normal loops? How are these results even possible? LINQ uses quicksort which runs at n*log(n), which per definition is already slower than n.

第二次运行的性能飞跃发生了什么?

And what is happening at the performance leap on the second run?

我们对这些结果既困惑又好奇,希望社区提供一些澄清的见解,以满足我们自己的好奇心.

We were both baffled and intrigued at these results and were hoping for some clarifying insights from the community, just to satisfy our own curiosity.

推荐答案

毫无疑问,您并没有真正执行查询,您只是定义了它.LINQ 构造一个表达式树,在您执行需要迭代枚举的操作之前,该表达式树不会实际计算.尝试将 ToList()Count() 操作添加到 LINQ 查询以强制对查询进行评估.

Undoubtedly you haven't actually performed the query, you've merely defined it. LINQ constructs an expression tree that isn't actually evaluated until you perform an operation that requires that the enumeration be iterated. Try adding a ToList() or Count() operation to the LINQ query to force the query to be evaluated.

根据您的评论,我希望这与您所做的类似.注意:我没有花任何时间弄清楚查询是否尽可能有效;我只想要一些查询来说明代码的结构.

Based on your comment I expect this is similar to what you've done. Note: I haven't spent any time figuring out if the query is as efficient as possible; I just want some query to illustrate how the code may be structured.

var dataset = ...

var watch = Stopwatch.StartNew();

var query = dataset.Where( d => dataset.Count( i => i == d ) > 1 );

watch.Stop();  // timer stops here

foreach (var item in query) // query is actually evaluated here
{
   ... print out the item...
}

这篇关于为什么 LINQ 操作可以比普通循环更快?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆