Using ToList() on Enumerable LINQ query results for large data sets - Efficiency Issue?

Published: 2020/5/1 5:34:59 · Tags: c# linq optimization .net-4.0 type-conversion


Question

I've been making a lot of use of LINQ queries in the application I'm currently writing, and one of the situations that I keep running into is having to convert the LINQ query results into lists for further processing (I have my reasons for wanting lists).

I'd like to have a better understanding of what happens in this list conversion, in case there are inefficiencies, since I've used it repeatedly now. So, given I execute a line like this:

var matches = (from x in list1 join y in list2 on x equals y select x).ToList();

Questions:

Is there any overhead here aside from the creation of a new list and its population with references to the elements in the Enumerable returned from the query?

Would you consider this inefficient?

Is there a way to get the LINQ query to directly generate a list to avoid the need for a conversion in this circumstance?

Recommended answer

Well, it creates a copy of the data. That could be inefficient - but it depends on what's going on. If you need a List<T> at the end, calling ToList() is usually going to be close to as efficient as you'll get. The one exception to that is if you're going to just do a conversion and the source is already a list - then using ConvertAll will be more efficient, as it can create the backing array of the right size to start with.
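As a rough illustration of the ConvertAll point, the sketch below projects the same List<int> two ways. The class and method names (ConvertAllSketch, ViaSelect, ViaConvertAll) are made up for this example. Both produce the same elements; the difference the answer describes is only in how the backing array is sized, and newer runtimes may narrow or remove that difference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names, for illustration only.
public static class ConvertAllSketch
{
    // LINQ route: Select returns a lazy sequence, and ToList() then
    // copies its elements into a new list.
    public static List<string> ViaSelect(List<int> source)
    {
        return source.Select(n => n.ToString()).ToList();
    }

    // List<T>.ConvertAll route: the source is already a list, so the
    // element count is known before any element is converted.
    public static List<string> ViaConvertAll(List<int> source)
    {
        return source.ConvertAll(n => n.ToString());
    }
}
```

ConvertAll is only available when the source is a List<T> (or an array, via Array.ConvertAll), so it does not help with a join query like the one in the question.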

If you only need to stream the data - e.g. you're just going to do a foreach on it, and taking actions which don't affect the original data sources - then calling ToList is definitely a potential source of inefficiency. It will force the whole of list1 to be evaluated - and if that's a lazily-evaluated sequence (e.g. "the first 1,000,000 values from a random number generator") then that's not good. Note that as you're doing a join, list2 will be evaluated anyway as soon as you try to pull the first value from the sequence (whether that's in order to populate a list or not).
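The difference between streaming and calling ToList can be made visible by counting how many elements are pulled from each side of the join. This is a minimal sketch: PullCounter, Counted, and the two demo methods are hypothetical helpers, and the small arrays stand in for list1 and list2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: counts how many items a consumer has pulled.
public sealed class PullCounter
{
    public int Count;
}

public static class StreamingSketch
{
    // Lazily yields items from source, incrementing the counter per item.
    public static IEnumerable<int> Counted(IEnumerable<int> source, PullCounter counter)
    {
        foreach (int item in source)
        {
            counter.Count++;
            yield return item;
        }
    }

    // Streams only the first match. Returns (outer pulls, inner pulls).
    public static Tuple<int, int> PullsForFirstMatch()
    {
        var outerPulls = new PullCounter();
        var innerPulls = new PullCounter();
        var list1 = Counted(new[] { 1, 2, 3, 4 }, outerPulls);
        var list2 = Counted(new[] { 1, 2 }, innerPulls);

        var query = from x in list1 join y in list2 on x equals y select x;
        query.First(); // stops after the first joined element

        return Tuple.Create(outerPulls.Count, innerPulls.Count);
    }

    // Materializes every match. Returns (outer pulls, inner pulls).
    public static Tuple<int, int> PullsForToList()
    {
        var outerPulls = new PullCounter();
        var innerPulls = new PullCounter();
        var list1 = Counted(new[] { 1, 2, 3, 4 }, outerPulls);
        var list2 = Counted(new[] { 1, 2 }, innerPulls);

        var query = from x in list1 join y in list2 on x equals y select x;
        query.ToList(); // forces the whole outer sequence to be read

        return Tuple.Create(outerPulls.Count, innerPulls.Count);
    }
}
```

In this sketch, pulling only the first match reads one element of the outer sequence but both elements of the inner one, while ToList() reads all four outer elements. The inner sequence is read in full in both cases, which is the point made about list2 above.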

You might want to read my Edulinq post on ToList to see what's going on - at least in one possible implementation - in the background.
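For a sense of what such an implementation might look like, here is a simplified sketch. It is not the framework's code or the code from the Edulinq post; MyToList and ToListSketch are invented names, and the two branches only illustrate the general idea that a source with a known count can be copied into a pre-sized list, while an arbitrary lazy sequence has to be appended item by item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a hand-written stand-in for ToList().
public static class ToListSketch
{
    public static List<T> MyToList<T>(this IEnumerable<T> source)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }

        // If the source already knows its size, allocate once and copy.
        var collection = source as ICollection<T>;
        if (collection != null)
        {
            var sized = new List<T>(collection.Count);
            sized.AddRange(collection);
            return sized;
        }

        // Otherwise the size is unknown: append one item at a time and
        // let the list grow its backing array as needed.
        var result = new List<T>();
        foreach (T item in source)
        {
            result.Add(item);
        }
        return result;
    }
}
```

Either way the result is a new list holding copies of the references (or values), which is the "copy of the data" mentioned at the start of the answer. A join query result falls into the second branch, since its size is not known until it has been enumerated.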
