Parallel.ForEach keeps spawning new threads

Question

While using Parallel.ForEach in my program, I found that some threads never seemed to finish. In fact, it kept spawning new threads over and over, a behaviour I wasn't expecting and definitely don't want.

I was able to reproduce this behaviour with the following code which, just like my 'real' program, is heavy on both processor and memory (.NET 4.0 code):

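using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Node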
{
    public Node Previous { get; private set; }

    public Node(Node previous)
    {
        Previous = previous;
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        DateTime startMoment = DateTime.Now;
        int concurrentThreads = 0;

        var jobs = Enumerable.Range(0, 2000);
        Parallel.ForEach(jobs, delegate(int jobNr)
        {
            Interlocked.Increment(ref concurrentThreads);

            int heavyness = jobNr % 9;

            //Give the processor and the garbage collector something to do...
            List<Node> nodes = new List<Node>();
            Node current = null;
            for (int y = 0; y < 1024 * 1024 * heavyness; y++)
            {
                current = new Node(current);
                nodes.Add(current);
            }

            TimeSpan elapsed = DateTime.Now - startMoment;
            int threadsRemaining = Interlocked.Decrement(ref concurrentThreads);
            // The backslash must be doubled so the format string receives "mm\:ss".
            Console.WriteLine("[{0:mm\\:ss}] Job {1,4} complete. {2} threads remaining.", elapsed, jobNr, threadsRemaining);
        });
    }
}

When run on my quad-core machine, it initially starts off with 4 concurrent threads, just as you would expect. However, over time more and more threads are created. Eventually, the program throws an OutOfMemoryException:

[00:00] Job    0 complete. 3 threads remaining.
[00:01] Job    1 complete. 4 threads remaining.
[00:01] Job    2 complete. 4 threads remaining.
[00:02] Job    3 complete. 4 threads remaining.
[00:05] Job    9 complete. 5 threads remaining.
[00:05] Job    4 complete. 5 threads remaining.
[00:05] Job    5 complete. 5 threads remaining.
[00:05] Job   10 complete. 5 threads remaining.
[00:08] Job   11 complete. 5 threads remaining.
[00:08] Job    6 complete. 5 threads remaining.
...
[00:55] Job   67 complete. 7 threads remaining.
[00:56] Job   81 complete. 8 threads remaining.
...
[01:54] Job  107 complete. 11 threads remaining.
[02:00] Job  121 complete. 12 threads remaining.
..
[02:55] Job  115 complete. 19 threads remaining.
[03:02] Job  166 complete. 21 threads remaining.
...
[03:41] Job  113 complete. 28 threads remaining.
<OutOfMemoryException>

The memory usage graph for the experiment above is as follows:

(The screenshot is in Dutch; the top half shows processor usage, the bottom half memory usage.) As you can see, a new thread appears to be spawned almost every time the garbage collector gets in the way (visible as the dips in memory usage).

Can anyone explain why this is happening, and what I can do about it? I just want .NET to stop spawning new threads and finish the existing ones first...

Recommended answer

You can limit the maximum number of threads that get created by passing a ParallelOptions instance with the MaxDegreeOfParallelism property set:

var jobs = Enumerable.Range(0, 2000);
ParallelOptions po = new ParallelOptions
{ 
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

Parallel.ForEach(jobs, po, jobNr =>
{
    // ...
});
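
To check that the cap is actually respected, the snippet above can be extended into a small self-contained program. This is only a sketch: the class name CappedParallelDemo, the method MeasurePeakConcurrency, and the Thread.Sleep stand-in workload are all hypothetical and not part of the original question. It counts how many loop bodies run at the same time and reports the highest value seen.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CappedParallelDemo
{
    // Hypothetical helper: runs jobCount short jobs with MaxDegreeOfParallelism
    // set to maxParallelism and returns the highest number of jobs that were
    // observed running at the same time.
    public static int MeasurePeakConcurrency(int jobCount, int maxParallelism)
    {
        int running = 0;
        int peak = 0;
        object gate = new object();

        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxParallelism
        };

        Parallel.ForEach(Enumerable.Range(0, jobCount), options, jobNr =>
        {
            int now = Interlocked.Increment(ref running);

            // Record the peak under a lock so concurrent updates are not lost.
            lock (gate)
            {
                if (now > peak)
                {
                    peak = now;
                }
            }

            // Stand-in for real work (assumption: a short blocking call).
            Thread.Sleep(5);

            Interlocked.Decrement(ref running);
        });

        return peak;
    }

    public static void Main()
    {
        int limit = Environment.ProcessorCount;
        int peak = MeasurePeakConcurrency(200, limit);
        Console.WriteLine("Limit: {0}, peak concurrent jobs: {1}", limit, peak);
    }
}
```

The reported peak should never exceed the limit, even though the loop bodies block. Without the ParallelOptions argument, the same measurement may climb above the processor count on a long enough run.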

As for why you are seeing this behaviour: by default, the TPL (which underlies PLINQ) is at liberty to guess the optimal number of threads to use. Whenever a parallel task blocks, the task scheduler may create a new thread in order to maintain progress. In your case, the blocking might be happening implicitly, for example through the Console.WriteLine call or (as you observed) during garbage collection.

From Concurrency Levels Tuning with Task Parallel Library (How Many Threads to Use?):

Since the TPL default policy is to use one thread per processor, we can conclude that TPL initially assumes that the workload of a task is ~100% working and 0% waiting, and if the initial assumption fails and the task enters a waiting state (i.e. starts blocking) - TPL will take the liberty to add threads as appropriate.
