Parallel.ForEach keeps spawning new threads

Question

While I was using Parallel.ForEach in my program, I found that some threads never seemed to finish. In fact, it kept spawning new threads over and over, a behaviour that I wasn't expecting and definitely don't want.

I was able to reproduce this behaviour with the following code which, just like my 'real' program, makes heavy use of both the processor and memory (.NET 4.0 code):

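using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Node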
{
    public Node Previous { get; private set; }

    public Node(Node previous)
    {
        Previous = previous;
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        DateTime startMoment = DateTime.Now;
        int concurrentThreads = 0;

        var jobs = Enumerable.Range(0, 2000);
        Parallel.ForEach(jobs, delegate(int jobNr)
        {
            Interlocked.Increment(ref concurrentThreads);

            int heavyness = jobNr % 9;

            //Give the processor and the garbage collector something to do...
            List<Node> nodes = new List<Node>();
            Node current = null;
            for (int y = 0; y < 1024 * 1024 * heavyness; y++)
            {
                current = new Node(current);
                nodes.Add(current);
            }

            TimeSpan elapsed = DateTime.Now - startMoment;
            int threadsRemaining = Interlocked.Decrement(ref concurrentThreads);
            Console.WriteLine("[{0:mm\\:ss}] Job {1,4} complete. {2} threads remaining.", elapsed, jobNr, threadsRemaining);
        });
    }
}

When run on my quad-core, it initially starts off with 4 concurrent threads, just as you would expect. However, over time more and more threads are created. Eventually, the program throws an OutOfMemoryException:

[00:00] Job    0 complete. 3 threads remaining.
[00:01] Job    1 complete. 4 threads remaining.
[00:01] Job    2 complete. 4 threads remaining.
[00:02] Job    3 complete. 4 threads remaining.
[00:05] Job    9 complete. 5 threads remaining.
[00:05] Job    4 complete. 5 threads remaining.
[00:05] Job    5 complete. 5 threads remaining.
[00:05] Job   10 complete. 5 threads remaining.
[00:08] Job   11 complete. 5 threads remaining.
[00:08] Job    6 complete. 5 threads remaining.
...
[00:55] Job   67 complete. 7 threads remaining.
[00:56] Job   81 complete. 8 threads remaining.
...
[01:54] Job  107 complete. 11 threads remaining.
[02:00] Job  121 complete. 12 threads remaining.
..
[02:55] Job  115 complete. 19 threads remaining.
[03:02] Job  166 complete. 21 threads remaining.
...
[03:41] Job  113 complete. 28 threads remaining.
<OutOfMemoryException>

The memory usage graph for the experiment above is as follows:

(The screenshot is in Dutch; the top part represents processor usage, the bottom part memory usage.) As you can see, it looks like a new thread is being spawned almost every time the garbage collector gets in the way (as can be seen in the dips of memory usage).

Can anyone explain why this is happening, and what I can do about it? I just want .NET to stop spawning new threads, and finish the existing threads first...

Recommended answer

You can limit the maximum number of threads that get created by specifying a ParallelOptions instance with the MaxDegreeOfParallelism property set:

var jobs = Enumerable.Range(0, 2000);
ParallelOptions po = new ParallelOptions
{ 
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

Parallel.ForEach(jobs, po, jobNr =>
{
    // ...
});
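
To check that the cap holds, the loop can be instrumented to record how many bodies run at the same time. The following is a minimal sketch, not part of the original answer: the class name CappedLoopDemo, the method MeasurePeakConcurrency, and the use of Thread.Sleep as a stand-in for real work are all assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CappedLoopDemo
{
    // Hypothetical helper: runs a capped Parallel.ForEach and returns the
    // highest number of loop bodies observed executing at the same time.
    public static int MeasurePeakConcurrency(int jobCount, int maxDegree)
    {
        object gate = new object();
        int running = 0;
        int peak = 0;

        ParallelOptions po = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxDegree
        };

        Parallel.ForEach(Enumerable.Range(0, jobCount), po, jobNr =>
        {
            // Count this body as running and record a new peak if needed.
            lock (gate)
            {
                running++;
                if (running > peak)
                {
                    peak = running;
                }
            }

            Thread.Sleep(20); // stand-in for a job that blocks

            lock (gate)
            {
                running--;
            }
        });

        return peak;
    }

    public static void Main()
    {
        int peak = MeasurePeakConcurrency(50, 2);
        Console.WriteLine("Peak concurrency with a cap of 2: {0}", peak);
    }
}
```

With MaxDegreeOfParallelism set to 2, the reported peak should never exceed 2, even though every job blocks.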

As to why you're getting the behaviour you're observing: The TPL (which underlies PLINQ) is, by default, at liberty to guess the optimal number of threads to use. Whenever a parallel task blocks, the task scheduler may create a new thread in order to maintain progress. In your case, the blocking might be happening implicitly; for example, through the Console.WriteLine call, or (as you observed) during garbage collection.
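
One way to observe this thread injection is to run an uncapped loop whose body only blocks, and count how many distinct threads end up executing it. This is a rough sketch under stated assumptions: ThreadInjectionDemo and CountThreadsUsed are hypothetical names, Thread.Sleep stands in for any blocking call, and the numbers it prints depend on the machine and the runtime's thread pool heuristics.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadInjectionDemo
{
    // Hypothetical helper: runs an uncapped Parallel.ForEach whose body only
    // blocks, and returns how many distinct managed threads executed bodies.
    public static int CountThreadsUsed(int jobCount, int blockMilliseconds)
    {
        var threadIds = new ConcurrentDictionary<int, bool>();

        Parallel.ForEach(Enumerable.Range(0, jobCount), jobNr =>
        {
            // Remember which thread ran this body (duplicates are ignored).
            threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);

            Thread.Sleep(blockMilliseconds); // blocked, not using the CPU
        });

        return threadIds.Count;
    }

    public static void Main()
    {
        int used = CountThreadsUsed(200, 100);
        Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);
        Console.WriteLine("Distinct threads used by the loop: {0}", used);
    }
}
```

If the second number comes out higher than the first, the scheduler has added threads while the existing ones were blocked, which is the effect described above.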

From Concurrency Levels Tuning with Task Parallel Library (How Many Threads to Use?) (http://aviadezra.blogspot.com/2009/10/how-many-threads-tpl-concurrency.html):


Since the TPL default policy is to use one thread per processor, we can conclude that TPL initially assumes that the workload of a task is ~100% working and 0% waiting, and if the initial assumption fails and the task enters a waiting state (i.e. starts blocking), TPL will take the liberty to add threads as appropriate.
