Parallel.ForEach hangs for a large loop

Question

I have an implementation of for loops that I am parallelizing using the TPL. I am using a Dell laptop with 4GB of RAM and a Core i3 processor. I have multiple Parallel.ForEach loops, which are invoked using Parallel.Invoke. The program is an add-in to Enterprise Architect that creates the model diagram and objects in EA.

The code looks like this:

Parallel.Invoke(() => parent1Creation(), () => parent2Creation(), ...);

where each parent creation is a Parallel.ForEach:

Parallel.ForEach(parents, parent => {
    // create parent
    // create children
    foreach (var child in parent.children) {
        childCreation();
    }

    foreach (var child2 in parent.children) {
        childCreation();
    }
    // can be any type and number of children
});

My problem is that when the loop size grows to around 1500-2000 iterations, Enterprise Architect stops working.

Is this caused by my laptop configuration, by the way I am using parallel loops, or by Enterprise Architect?

How can I resolve it?

Recommended answer

I don't suggest this strategy. Running lots of Parallel.ForEach loops at once won't necessarily help your performance (see the caveat later in the post), especially if each of the Parallel.ForEach loops is handling a large number of iterations. At some point, using additional threads won't benefit your performance anymore and will just add overhead.
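As a rough sketch of the alternative (not the asker's actual code), the nested Parallel.Invoke / Parallel.ForEach structure could be flattened into a single Parallel.ForEach over all parents, so that one loop decides how many threads to use. The parentGroups collection and the string "parents" are hypothetical stand-ins, and the snippet assumes C# top-level statements:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-ins for the question's parent groups: each group is a list of parent names.
var parentGroups = new List<List<string>>
{
    Enumerable.Range(0, 1000).Select(i => $"A{i}").ToList(),
    Enumerable.Range(0, 1000).Select(i => $"B{i}").ToList(),
};

var created = new ConcurrentBag<string>();

// One loop over all parents, instead of Parallel.Invoke starting one Parallel.ForEach per group.
Parallel.ForEach(parentGroups.SelectMany(group => group), parent =>
{
    // Placeholder for "create parent, then create its children sequentially".
    created.Add(parent);
});

Console.WriteLine($"Created {created.Count} parents");
```

Whether this is actually faster for a given workload still has to be measured; the point is only that there is one place where the degree of parallelism is chosen.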

The caveat here is that Parallel.ForEach is generally good (but not perfect) at selecting the optimal number of threads for a particular foreach loop. There's no explicit guarantee as to exactly how many threads a particular foreach loop will use (or even that it will run in parallel), so it's conceivable that multiple Parallel.ForEach loops will, in fact, enhance your performance. The best way to check that is to use the debugger to see how many threads it's actually using at any given point. If it's not what you'd expect, you might check the implementation of the code in the Parallel.ForEach loop (for example); there are other steps you could take at this point to try to improve the performance (e.g. a good async/await implementation for IO-bound and other non-CPU-bound operations so that the thread can do more work - see below).
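Besides the debugger, one quick way to get a feel for how many threads a loop ends up using is to record the managed thread ID of each iteration. This is only an illustrative sketch: the iteration count and the Thread.SpinWait call are placeholders for real work, and the number printed will vary from machine to machine and run to run.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Set of distinct managed thread IDs seen by the loop body.
var threadIds = new ConcurrentDictionary<int, byte>();

Parallel.ForEach(Enumerable.Range(0, 2000), i =>
{
    // Record which managed thread ran this iteration.
    threadIds.TryAdd(Environment.CurrentManagedThreadId, 0);
    Thread.SpinWait(10_000); // stand-in for CPU-bound work
});

Console.WriteLine($"Distinct threads used: {threadIds.Count} (logical cores: {Environment.ProcessorCount})");
```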

Trivial example: suppose you have a system with 4 threads and 4 cores, and those 4 threads are the only things running on the system. (Obviously, this will never happen.) The sensible thing from a scheduling point of view would be to have each core handle one thread. Assuming that each of the threads is busy all the time (i.e. it's never sitting around waiting), how could adding additional threads improve your performance? If you start running, for example, 6 threads, then at least one core will have to run at least 2 threads, which adds extra overhead with no clear benefit. The simplifying (and possibly untrue) assumptions here are that your tasks are 100% CPU-bound and that the threads are, in fact, running on separate cores. If either of these assumptions is untrue, that's a clear opportunity for enhancement. For example, if a thread spends a significant amount of time waiting for results from IO-bound operations, running more threads than cores could, in fact, improve performance. You could also consider an async/await implementation to improve performance.

The point being that at some point adding additional threads won't give you any performance benefit, just added overhead (especially if the tasks involved are mostly CPU-bound rather than mostly IO-bound, for example). There's no way around that fact.

Non-CPU-bound operations (IO-bound tasks like calls to servers, for example), where the main holdup is waiting for a result from something external to the CPU/memory, are parallelized differently. In fact, async/await does not necessarily create new threads; one of its major behaviors is to return control to the caller of the awaiting method and "try" to do other work on the same thread if possible.
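A minimal sketch of that idea, assuming C# top-level statements: FetchAsync is a hypothetical IO-bound call, with Task.Delay standing in for the time spent waiting on a server. All ten calls are started first and then awaited together, so the waits overlap instead of each one occupying a thread.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical IO-bound call; Task.Delay stands in for waiting on a server.
async Task<int> FetchAsync(int id)
{
    await Task.Delay(200); // no thread is blocked while waiting
    return id * 2;
}

var stopwatch = Stopwatch.StartNew();

// Start all ten "requests", then await them together.
int[] results = await Task.WhenAll(Enumerable.Range(1, 10).Select(id => FetchAsync(id)));

stopwatch.Stop();
Console.WriteLine($"{results.Length} results in {stopwatch.ElapsedMilliseconds} ms");
```

The elapsed time should be much closer to one delay than to ten of them, although the exact figure depends on the machine.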

To repeat my favorite analogy, suppose that you go out to eat as part of a group of 10 people. When the waiter comes by to take orders, the first guy the waiter asks isn't ready to order, but the other nine people are. The correct thing for the waiter to do is, rather than wait for the first guy to be ready, to have the other 9 people order first and then have the first guy order afterwards if he's ready by then. He definitely does not bring in a second waiter to wait for the one guy to be ready; in this case, the second waiter probably wouldn't actually reduce the total amount of time taken to complete the order. This is basically what async/await tries to accomplish; if all an operation is doing is waiting for a result from a server, for example, ideally you'd be able to do other things while it's waiting.

On the other hand, to extend the analogy, the waiter definitely does not cook the meal himself. For the cooking, adding more people (by analogy, threads) would genuinely speed things up.

To extend the analogy even further, if all the kitchen has is a four-burner stove, then the stove size puts a hard limit on how many people you can usefully add to the kitchen staff. Once you hit that limit, more kitchen staff will actually slow things down, because they'll just be getting in each other's way. No matter how big your kitchen staff is, you can't possibly have more than 4 items cooking on the stove at once. In this case, the number of cores you have is like the size of the stove; once you reach a certain point, adding more kitchen staff (threads) will detract from your performance (not enhance it).
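To make the stove limit concrete, here is a small sketch that caps a loop at four concurrent iterations with ParallelOptions.MaxDegreeOfParallelism and tracks the highest number of iterations that were running at the same time. The value 4, the iteration count, and the Thread.SpinWait placeholder are illustrative assumptions, not values from the question.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int running = 0; // iterations currently executing
int peak = 0;    // highest value of running observed

// Cap the loop at four concurrent iterations, like a four-burner stove.
var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

Parallel.ForEach(Enumerable.Range(0, 200), options, i =>
{
    int now = Interlocked.Increment(ref running);

    // Raise peak to now unless another thread already recorded a higher value.
    int seen;
    while ((seen = Volatile.Read(ref peak)) < now &&
           Interlocked.CompareExchange(ref peak, now, seen) != seen) { }

    Thread.SpinWait(50_000); // stand-in for CPU-bound work
    Interlocked.Decrement(ref running);
});

Console.WriteLine($"Peak concurrent iterations: {peak}");
```

In practice a cap near Environment.ProcessorCount is a common starting point for CPU-bound work, but the right value is something to measure rather than assume.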
