Managing the TPL Queue

Question

I've got a service that runs scans of various servers. The networks in question can be huge (hundreds of thousands of network nodes).

The current version of the software uses a queueing/threading architecture of our own design. It works, but it isn't as efficient as it could be, not least because jobs can spawn children and that isn't handled well.

V2 is coming up and I'm considering using the TPL. It seems like it should be ideally suited.

I've seen this question, the answer to which implies there's no limit to the number of tasks TPL can handle. In my simple tests (spin up 100,000 tasks and give them to TPL), TPL barfed fairly early on with an out-of-memory exception (fair enough - especially on my dev box).

The scans take a variable length of time, but 5 mins/task is a good average.

As you can imagine, scans for huge networks can take a considerable length of time, even on beefy servers.

I've already got a framework in place which allows the scan jobs (stored in a Db) to be split between multiple scan servers, but the question is how exactly I should pass work to the TPL on a specific server.

Can I monitor the size of TPL's queue and (say) top it up if it falls below a couple of hundred entries? Is there a downside to doing this?

I also need to handle the situation where a scan needs to be paused. This seems easier to do by not giving the work to TPL than by cancelling/resetting tasks which may already be partially processed.

All of the initial tasks can be run in any order. Children must be run after the parent has started executing, but since the parent spawns them, this shouldn't ever be a problem. Children can be run in any order. Because of this, I'm currently envisioning that child tasks be written back to the Db rather than spawned directly into TPL. This would allow other servers to "work steal" if required.

Has anyone had any experience with using the TPL in this way? Are there any considerations I need to be aware of?

Solution

TPL is about starting small units of work and running them in parallel. It is not about monitoring, pausing, or throttling this work.

You should see TPL as a low-level tool to start "work" and to synchronize threads.

Key point: TPL tasks != logical tasks. Logical tasks are in your case scan-tasks ("scan an ip-range from x to y"). Such a task should not correspond one-to-one to a physical task (System.Threading.Tasks.Task) because the two are different concepts.

You need to schedule, orchestrate, monitor and pause the logical tasks yourself because TPL does not understand them and cannot be made to.
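
To make that concrete, here is a minimal sketch of a coordinator that owns the logical jobs and only hands a bounded number of them to the TPL at a time. Everything in it is an assumption for illustration: ScanCoordinator, its in-memory queue (standing in for the Db), and the scan delegate are hypothetical names, not part of the TPL or of the asker's system.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical coordinator: it owns the logical scan jobs and decides when
// each one is handed to the TPL. The in-memory queue stands in for the Db.
public sealed class ScanCoordinator
{
    private readonly ConcurrentQueue<string> _pending = new ConcurrentQueue<string>();
    private readonly SemaphoreSlim _slots;   // caps the number of in-flight jobs
    private volatile bool _paused;

    public ScanCoordinator(int maxInFlight)
    {
        _slots = new SemaphoreSlim(maxInFlight);
    }

    public void Enqueue(string job) => _pending.Enqueue(job);
    public void Pause() => _paused = true;    // stop dispatching; in-flight jobs finish
    public void Resume() => _paused = false;
    public int PendingCount => _pending.Count;

    // Dispatches jobs until the queue is empty or Pause() is called, then waits
    // for the in-flight jobs. Returns how many jobs were dispatched. Jobs that
    // are enqueued after the loop has exited need another call to RunAsync.
    public async Task<int> RunAsync(Func<string, Task> scan)
    {
        var inFlight = new List<Task>();
        int dispatched = 0;
        while (!_paused)
        {
            await _slots.WaitAsync();         // wait for a free slot
            if (_paused || !_pending.TryDequeue(out var job))
            {
                _slots.Release();
                break;
            }
            dispatched++;
            inFlight.Add(RunOneAsync(job, scan));
        }
        await Task.WhenAll(inFlight);
        return dispatched;
    }

    private async Task RunOneAsync(string job, Func<string, Task> scan)
    {
        try { await scan(job); }
        finally { _slots.Release(); }         // free the slot even if the scan fails
    }
}
```

With this shape, pausing simply means not dispatching anything new, so nothing that is partially processed has to be cancelled or reset, and jobs that were never dispatched stay in the queue where another server could pick them up.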

Now the more practical concerns:

  1. TPL can certainly start 100k tasks without OOM. The OOM happened because your tasks' code exhausted memory.
  2. Scanning networks sounds like a great case for asynchronous code, because a scan spends most of its time waiting on results while many scans run in parallel. You probably don't want to have 500 threads in your process all waiting for a network packet to arrive. Asynchronous tasks fit well with the TPL because every piece of work you run becomes purely CPU-bound and small. That is the sweet spot for TPL.
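
As a rough illustration of both points, the sketch below starts one asynchronous task per node without dedicating a thread to each wait. It is an assumption-laden toy: AsyncScanDemo and ProbeAsync are hypothetical names, Task.Delay stands in for waiting on a network reply, and the "responded" result is made up so there is something to count.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncScanDemo
{
    // Hypothetical probe: Task.Delay stands in for waiting on a network reply.
    // No thread is blocked while the delay is pending.
    public static async Task<bool> ProbeAsync(int node, int waitMs)
    {
        await Task.Delay(waitMs);
        return node % 2 == 0;   // made-up result: even-numbered nodes "respond"
    }

    // Starts one lightweight task per node and awaits them all.
    public static async Task<int> CountRespondingAsync(int nodeCount, int waitMs)
    {
        Task<bool>[] probes = Enumerable.Range(0, nodeCount)
                                        .Select(n => ProbeAsync(n, waitMs))
                                        .ToArray();
        bool[] results = await Task.WhenAll(probes);
        return results.Count(r => r);
    }
}
```

Calling AsyncScanDemo.CountRespondingAsync(100000, 50) creates 100,000 tasks that all wait concurrently. Each pending task costs a small heap allocation rather than a thread, which is why the task count by itself is unlikely to be what exhausts memory.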
