Parallel.ForEach keeps spawning new threads
Problem description
While I was using Parallel.ForEach in my program, I found that some threads never seemed to finish. In fact, it kept spawning new threads over and over, a behaviour that I wasn't expecting and definitely don't want.
I was able to reproduce this behaviour with the following code which, just like my 'real' program, makes heavy use of both the processor and memory (.NET 4.0 code):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Node
{
    public Node Previous { get; private set; }
    public Node(Node previous)
    {
        Previous = previous;
    }
}
public class Program
{
    public static void Main(string[] args)
    {
        DateTime startMoment = DateTime.Now;
        int concurrentThreads = 0;
        var jobs = Enumerable.Range(0, 2000);
        Parallel.ForEach(jobs, delegate(int jobNr)
        {
            Interlocked.Increment(ref concurrentThreads);
            int heavyness = jobNr % 9;
            //Give the processor and the garbage collector something to do...
            List<Node> nodes = new List<Node>();
            Node current = null;
            for (int y = 0; y < 1024 * 1024 * heavyness; y++)
            {
                current = new Node(current);
                nodes.Add(current);
            }
            TimeSpan elapsed = DateTime.Now - startMoment;
            int threadsRemaining = Interlocked.Decrement(ref concurrentThreads);
            Console.WriteLine("[{0:mm\\:ss}] Job {1,4} complete. {2} threads remaining.", elapsed, jobNr, threadsRemaining);
        });
    }
}
When run on my quad-core machine, it initially starts off with 4 concurrent threads, just as you would expect. However, over time more and more threads are created. Eventually, the program throws an OutOfMemoryException:
[00:00] Job 0 complete. 3 threads remaining.
[00:01] Job 1 complete. 4 threads remaining.
[00:01] Job 2 complete. 4 threads remaining.
[00:02] Job 3 complete. 4 threads remaining.
[00:05] Job 9 complete. 5 threads remaining.
[00:05] Job 4 complete. 5 threads remaining.
[00:05] Job 5 complete. 5 threads remaining.
[00:05] Job 10 complete. 5 threads remaining.
[00:08] Job 11 complete. 5 threads remaining.
[00:08] Job 6 complete. 5 threads remaining.
...
[00:55] Job 67 complete. 7 threads remaining.
[00:56] Job 81 complete. 8 threads remaining.
...
[01:54] Job 107 complete. 11 threads remaining.
[02:00] Job 121 complete. 12 threads remaining.
..
[02:55] Job 115 complete. 19 threads remaining.
[03:02] Job 166 complete. 21 threads remaining.
...
[03:41] Job 113 complete. 28 threads remaining.
<OutOfMemoryException>
The memory usage graph for the experiment above is as follows:
(The screenshot is in Dutch; the top part represents processor usage, the bottom part memory usage.) As you can see, it looks like a new thread is spawned almost every time the garbage collector gets in the way (visible as the dips in memory usage).
Can anyone explain why this is happening, and what I can do about it? I just want .NET to stop spawning new threads, and finish the existing threads first...
Recommended answer
You can limit the maximum number of threads that get created by specifying a ParallelOptions instance with the MaxDegreeOfParallelism property set:
var jobs = Enumerable.Range(0, 2000);
ParallelOptions po = new ParallelOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
};
Parallel.ForEach(jobs, po, jobNr =>
{
    // ...
});
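To check that the limit is actually respected, the snippet above can be expanded into a small runnable sketch. The class name BoundedLoopSketch, the method MeasurePeakConcurrency, and the spin-wait used as a stand-in for real work are all hypothetical and not part of the original question or answer; the sketch only counts how many loop bodies run at the same time.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class BoundedLoopSketch
{
    // Runs jobCount dummy jobs and returns the highest number of loop
    // bodies that were observed executing at the same time.
    public static int MeasurePeakConcurrency(int jobCount, int maxDegree)
    {
        object gate = new object();
        int running = 0;
        int peak = 0;

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(Enumerable.Range(0, jobCount), options, jobNr =>
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }

            Thread.SpinWait(200000); // assumed stand-in for CPU-bound work

            lock (gate)
            {
                running--;
            }
        });

        return peak;
    }

    public static void Main()
    {
        int limit = Environment.ProcessorCount;
        int peak = MeasurePeakConcurrency(200, limit);
        Console.WriteLine("Limit: {0}, observed peak: {1}", limit, peak);
    }
}
```

The observed peak should never exceed the configured limit, although it may be lower on a busy machine. Note that the limit caps concurrent loop bodies, not the number of distinct threads that take part over the lifetime of the loop.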
As to why you're getting the behaviour you're observing: the TPL (which underlies PLINQ) is, by default, at liberty to guess the optimal number of threads to use. Whenever a parallel task blocks, the task scheduler may create a new thread in order to maintain progress. In your case, the blocking might be happening implicitly; for example, through the Console.WriteLine call, or (as you observed) during garbage collection.
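The effect of blocking can be explored with a rough experiment along these lines. It is only a sketch: BlockingLoopSketch and MeasurePeakConcurrency are made-up names, and Thread.Sleep is an assumed stand-in for whatever implicit blocking happens in the real program. It compares the peak number of concurrently running loop bodies with the default options against a bounded run.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical experiment, for illustration only.
public static class BlockingLoopSketch
{
    // Runs jobCount jobs that each block for blockMilliseconds and returns
    // the highest number of loop bodies observed running at the same time.
    public static int MeasurePeakConcurrency(int jobCount, int blockMilliseconds, ParallelOptions options)
    {
        object gate = new object();
        int running = 0;
        int peak = 0;

        Parallel.ForEach(Enumerable.Range(0, jobCount), options, jobNr =>
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }

            Thread.Sleep(blockMilliseconds); // assumed stand-in for a blocking call

            lock (gate)
            {
                running--;
            }
        });

        return peak;
    }

    public static void Main()
    {
        // Default options: MaxDegreeOfParallelism is -1, meaning no explicit limit.
        int unbounded = MeasurePeakConcurrency(64, 50, new ParallelOptions());

        var bounded = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
        int limited = MeasurePeakConcurrency(64, 50, bounded);

        Console.WriteLine("Processors: {0}", Environment.ProcessorCount);
        Console.WriteLine("Peak without limit: {0}", unbounded);
        Console.WriteLine("Peak with limit:    {0}", limited);
    }
}
```

The numbers depend on the machine, the runtime version, and how long the loop runs, so the unbounded peak may or may not exceed the processor count in a run this short. Only the bounded peak is guaranteed to stay at or below its limit.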
From Concurrency Levels Tuning with Task Parallel Library (How Many Threads to Use?) (http://aviadezra.blogspot.com/2009/10/how-many-threads-tpl-concurrency.html):
Since the TPL default policy is to use one thread per processor, we can conclude that TPL initially assumes that the workload of a task is ~100% working and 0% waiting, and if the initial assumption fails and the task enters a waiting state (i.e. starts blocking), TPL will take the liberty to add threads as appropriate.