“窃取工作"指的是与“耸耸肩"? [英] "Work stealing" vs. "Work shrugging"?

查看:91
本文介绍了“窃取工作"指的是与“耸耸肩"?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

为什么作为动态负载平衡策略,我可以找到很多有关偷工作"的信息,而找不到关于耸耸肩"的信息?

Why is it that I can find lots of information on "work stealing" and nothing on "work shrugging" as a dynamic load-balancing strategy?

耸耸肩"是指将多余的工作从繁忙的处理器中推到负载较少的邻居上,而不是让闲置的处理器从繁忙的邻居中进行工作(工作" -偷").

By "work-shrugging" I mean pushing surplus work away from busy processors onto less loaded neighbours, rather than have idle processors pulling work from busy neighbours ("work-stealing").

我认为两种策略的总体可伸缩性都应该相同.但是,我认为,就延迟和时间而言,它的效率要高得多.功耗,以便在确实有工作要做时唤醒空闲处理器,而不是让所有空闲处理器定期轮询所有邻居以进行可能的工作.

I think the general scalability should be the same for both strategies. However I believe that it is much more efficient, in terms of latency & power consumption, to wake an idle processor when there is definitely work for it to do, rather than having all idle processors periodically polling all neighbours for possible work.

无论如何,一个快速的google都不会在耸耸肩"或类似标题的标题下显示任何内容,因此,欢迎使用指向该技术的现有技术和术语的指针.

Anyway a quick google didn't show up anything under the heading of "Work Shrugging" or similar so any pointers to prior-art and the jargon for this strategy would be welcome.

澄清

我实际上设想工作提交处理器(可能是目标处理器,也可能不是目标处理器)负责查看首选目标处理器的即时位置(基于数据/代码位置)到决定是否应该给附近的邻居一个新工作,因为他们没有太多工作要做.

I actually envisage the work submitting processor (which may or may not be the target processor) being responsible for looking around the immediate locality of the preferred target processor (based on data/code locality) to decide if a near neighbour should be given the new work instead because they don't have as much work to do.

我认为,决策逻辑所需要的不仅仅是原子读取相邻邻居(通常为2到4个)的估计q长度.我认为这与盗贼投票和暗指所隐含的耦合程度没有多大关系.从邻居那里偷东西. (我在两种策略中都假定队列为无锁,无等待".

I dont think the decision logic would require much more than an atomic read of the immediate (typically 2 to 4) neighbours' estimated q length here. I do not think this is any more coupling than implied by the thieves polling & stealing from their neighbours. (I am assuming "lock-free, wait-free" queues in both strategies).

分辨率

我所说的工作耸耸肩"策略似乎(但仅作部分描述!)属于正常的"前期调度策略的范畴,该策略在处理器,缓存和放大器管理方面很聪明.记忆的忠诚度和可扩展性.

It seems that what I meant (but only partially described!) as "Work Shrugging" strategy is in the domain of "normal" upfront scheduling strategies that happen to be smart about processor, cache & memory loyality, and scaleable.

我找到了很多搜索这些术语的参考文献,其中一些看起来很扎实.当我确定最符合(或拆除!)我对工作耸耸肩"的定义所想到的逻辑的参考文献.

I find plenty of references searching on these terms and several of them look pretty solid. I will post a reference when I identify one that best matches (or demolishes!) the logic I had in mind with my definition of "Work Shrugging".

推荐答案

负载均衡不是免费的.这需要上下文切换(到内核),查找空闲处理器以及选择要重新分配的工作.尤其是在每时每刻数十次切换任务的机器上,这笔费用加起来了.

Load balancing is not free; it has a cost of a context switch (to the kernel), finding the idle processors, and choosing work to reassign. Especially in a machine where tasks switch all the time, dozens of times per second, this cost adds up.

那有什么区别?工作调整意味着您进一步负担了超额配置的资源(繁忙的处理器)以及负载平衡的开销.当隔壁没有处理器可做的事情时,为什么要用管理员来中断繁忙的处理器?另一方面,工作窃取使闲置的处理器运行负载平衡器,而繁忙的处理器则继续工作.窃取工作可以节省时间.

So what's the difference? Work-shrugging means you further burden over-provisioned resources (busy processors) with the overhead of load-balancing. Why interrupt a busy processor with administrivia when there's a processor next door with nothing to do? Work stealing, on the other hand, lets the idle processors run the load balancer while busy processors get on with their work. Work-stealing saves time.

考虑:处理器A分配有两个任务.它们分别花费时间 a1 a2 .附近的处理器B(可能是缓存反弹的距离)处于空闲状态.处理器在所有方面都是相同的.我们假设每个任务的代码和内核都位于两个处理器的i-cache中(在负载平衡方面不会增加页面错误).

Consider: Processor A has two tasks assigned to it. They take time a1 and a2, respectively. Processor B, nearby (the distance of a cache bounce, perhaps), is idle. The processors are identical in all respects. We assume the code for each task and the kernel is in the i-cache of both processors (no added page fault on load balancing).

任何类型的上下文切换(包括负载平衡)需要花费时间 c.

A context switch of any kind (including load-balancing) takes time c.

完成任务的时间为 a1 + a2 + c.处理器A将完成所有工作,并在两个任务之间进行一次上下文切换.

The time to complete the tasks will be a1 + a2 + c. Processor A will do all the work, and incur one context switch between the two tasks.

假设B窃取了 a2 ,这本身就导致了上下文切换时间.该工作将在 max(a1,a2 + c)时间内完成.假设处理器A在 a1上开始工作; 在这样做的同时,处理器B将窃取 a2 并避免对a1的处理造成任何中断. B上的所有开销都是自由循环.

Assume B steals a2, incurring the context switch time itself. The work will be done in max(a1, a2 + c) time. Suppose processor A begins working on a1; while it does that, processor B will steal a2 and avoid any interruption in the processing of a1. All the overhead on B is free cycles.

如果 a2 是较短的任务,则在此情况下,您已经有效地隐藏了上下文切换的成本;总时间为 a1.

If a2 was the shorter task, here, you have effectively hidden the cost of a context switch in this scenario; the total time is a1.

假设B如上所述完成 a2 ,但是A承担了移动它的费用(耸耸肩"工作).这种情况下的工作将在 max(a1,a2)+ c 时间完成;现在,上下文切换始终是总时间之外的 ,而不是隐藏的.此处,处理器B的空闲周期已被浪费;相反,繁忙的处理器A已将时间消​​耗的精力浪费在了B上.

Assume B completes a2, as above, but A incurs the cost of moving it ("shrugging" the work). The work in this case will be done in max(a1, a2) + c time; the context switch is now always in addition to the total time, instead of being hidden. Processor B's idle cycles have been wasted, here; instead, a busy processor A has burned time shrugging work to B.

这篇关于“窃取工作"指的是与“耸耸肩"?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆