Is PLINQ Inherently Faster than System.Threading.Tasks.Parallel.ForEach?


Question


Summary: I switched from System.Threading.Tasks.Parallel.ForEach with a concurrent data structure to a simple PLINQ (Parallel LINQ) query. The speedup was amazing.

So is PLINQ inherently faster than Parallel.ForEach? Or is it specific to the task?

// Original code
// Concurrent dictionary to store results
var resultDict = new ConcurrentDictionary<string, MyResultType>();

Parallel.ForEach(items, item =>
{
    // Pass the loop variable; `source` is not defined in this snippet
    resultDict.TryAdd(item.Name, PerformWork(item));
});


// New code
var results = items
    .AsParallel()
    .Select(item => new { item.Name, queryResult = PerformWork(item) })
    // The anonymous type exposes Name (not SourceName) and queryResult
    .ToDictionary(kv => kv.Name, kv => kv.queryResult);

Notes: Each task (PerformWork) now runs for between 0 and 200 ms. It used to take longer before I optimized it, which is why I was using the Tasks.Parallel library in the first place. So I went from 2 seconds total to ~100-200 ms total, performing roughly the same work, just with different methods. (Wow, LINQ and PLINQ are awesome!)

Questions:

  1. Is the speedup due to using PLINQ instead of Parallel.ForEach?
  2. Or is it simply due to removing the concurrent data structure (ConcurrentDictionary), because the new code no longer needs to synchronize threads?
  3. Based on the answer to this related question:

Whereas PLINQ is largely based on a functional style of programming with no side-effects, side-effects are precisely what the TPL is for. If you want to actually do work in parallel as opposed to just searching/selecting things in parallel, you use the TPL.

Can I assume that because my pattern is basically functional (given inputs produce new outputs without mutation), PLINQ is the correct technology to use?

I'm looking for validation that my assumptions are correct, or an indication that I'm missing something.
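
One way to separate hypotheses 1 and 2 is to time a third variant that keeps Parallel.ForEach but drops the ConcurrentDictionary. The following is a rough sketch only: `items` and `PerformWork` are hypothetical stand-ins for the question's workload, and Stopwatch timings like these are noisy (the first variant also pays thread-pool warm-up), so a proper benchmark harness would be needed for firm conclusions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins for the question's `items` and `PerformWork`.
var items = Enumerable.Range(0, 200).Select(i => "item" + i).ToArray();

int PerformWork(string name)
{
    Thread.Sleep(name.Length % 3);   // simulated, slightly uneven cost
    return name.Length;
}

// Variant A: Parallel.ForEach writing into a ConcurrentDictionary.
var sw = Stopwatch.StartNew();
var viaConcurrent = new ConcurrentDictionary<string, int>();
Parallel.ForEach(items, item =>
{
    viaConcurrent.TryAdd(item, PerformWork(item));
});
var timeA = sw.Elapsed;

// Variant B: Parallel.ForEach again, but each iteration writes to its own
// array slot, so no concurrent collection is involved.
sw.Restart();
var slots = new int[items.Length];
Parallel.ForEach(items, (item, state, index) =>
{
    slots[index] = PerformWork(item);
});
var viaSlots = new Dictionary<string, int>();
for (int i = 0; i < items.Length; i++) viaSlots[items[i]] = slots[i];
var timeB = sw.Elapsed;

// Variant C: PLINQ, as in the new code from the question.
sw.Restart();
var viaPlinq = items.AsParallel()
    .Select(item => new { Name = item, Result = PerformWork(item) })
    .ToDictionary(kv => kv.Name, kv => kv.Result);
var timeC = sw.Elapsed;

Console.WriteLine($"A (ForEach + ConcurrentDictionary): {timeA.TotalMilliseconds:F0} ms");
Console.WriteLine($"B (ForEach + array slots):          {timeB.TotalMilliseconds:F0} ms");
Console.WriteLine($"C (PLINQ ToDictionary):             {timeC.TotalMilliseconds:F0} ms");
```

If A and B come out close while C differs, the collection is probably not the main factor. If B is much closer to C than to A, the ConcurrentDictionary is the more likely cause.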

Solution

Based on the limited information in your sample (I asked for more details in a comment on the OP), my guess is that you're seeing differences due to the partitioning algorithm being used. You should read up on Chunk Partitioning vs. Range Partitioning in this blog post, where the author discusses how they differ and which types of work each is best suited to. I highly recommend reading that article as well as this one, which goes into a little more detail on those two types, covers two other types of partitioning (not applicable to your sample), and gives some visual aids to help you understand partitioning. Finally, here's yet another blog post that discusses work partitioning and how it can affect you when the default partitioning algorithm doesn't make sense for your particular workload. That post refers to a great program, part of a set of parallel samples from the PFX team, that helps you visualize the partitioners at work.
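
To experiment with partitioning directly, the `System.Collections.Concurrent.Partitioner` class lets you choose a strategy explicitly instead of relying on the default. The sketch below is illustrative only: the integer `items` array and this `PerformWork` are hypothetical, not the asker's code. It shows fixed index ranges fed to Parallel.ForEach and a load-balancing partitioner fed to PLINQ.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical workload: the cost grows with the input value, so work is uneven.
var items = Enumerable.Range(0, 100).ToArray();

int PerformWork(int n)
{
    Thread.SpinWait(n * 1000);
    return n * 2;
}

// Range partitioning: the index range [0, Length) is split into fixed
// sub-ranges, and each loop body processes one whole sub-range.
var ranges = Partitioner.Create(0, items.Length);
var viaRanges = new int[items.Length];
Parallel.ForEach(ranges, range =>
{
    for (int i = range.Item1; i < range.Item2; i++)
    {
        viaRanges[i] = PerformWork(items[i]);
    }
});

// Load-balancing partitioning: workers take elements on demand, which can
// help when per-item cost varies. PLINQ accepts a partitioner via AsParallel().
var balanced = Partitioner.Create(items, loadBalance: true);
var viaBalanced = balanced.AsParallel()
    .Select(n => (Input: n, Output: PerformWork(n)))
    .ToDictionary(p => p.Input, p => p.Output);

Console.WriteLine($"Range-partitioned results:  {viaRanges.Length}");
Console.WriteLine($"Load-balanced PLINQ results: {viaBalanced.Count}");
```

Timing both strategies against your real workload is the most direct way to see whether partitioning explains the difference.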
