How is the fork/join framework better than a thread pool?


Question



What are the benefits of using the new fork/join framework over simply splitting the big task into N subtasks at the beginning, sending them to a cached thread pool (from Executors) and waiting for each task to complete? I fail to see how the fork/join abstraction simplifies the problem or makes the solution more efficient than what we've had for years now.

For example, the parallelized blurring algorithm in the tutorial example could be implemented like this:

public class ForkBlur implements Runnable {
    private int[] mSource;
    private int mStart;
    private int mLength;
    private int[] mDestination;

    private int mBlurWidth = 15; // Processing window size, should be odd.

    public ForkBlur(int[] src, int start, int length, int[] dst) {
        mSource = src;
        mStart = start;
        mLength = length;
        mDestination = dst;
    }

    public void run() {
        computeDirectly();
    }

    protected void computeDirectly() {
        // As in the example, omitted for brevity
    }
}

Split in the beginning and send tasks to a thread pool:

// source image pixels are in src
// destination image pixels are in dst
// threadPool is a (cached) thread pool

int maxSize = 100000; // analogous to F-J's "sThreshold"
List<Future> futures = new ArrayList<Future>();

// Send stuff to thread pool:
for (int i = 0; i < src.length; i+= maxSize) {
    int size = Math.min(maxSize, src.length - i);
    ForkBlur task = new ForkBlur(src, i, size, dst);
    Future f = threadPool.submit(task);
    futures.add(f);
}

// Wait for all sent tasks to complete:
for (Future future : futures) {
    future.get();
}

// Done!

The tasks go to the thread pool's queue, from which they're executed as worker threads become available. As long as the splitting is granular enough (so you don't end up waiting on one last long-running task) and the thread pool has enough threads (at least as many as there are processors), all processors work at full speed until the whole computation is done.
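
For completeness, the snippet above never shows how threadPool is created. A minimal sketch of how it could be set up, either as the cached pool mentioned above or as a fixed pool sized to the processor count (which is enough for CPU-bound work like this); the shutdown call is just housekeeping:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// A cached pool grows on demand, as assumed in the snippet above ...
ExecutorService threadPool = Executors.newCachedThreadPool();

// ... or, for CPU-bound work, one thread per available processor:
// ExecutorService threadPool =
//         Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

// After submitting the tasks and waiting on their futures:
threadPool.shutdown(); // stop accepting new tasks and let the worker threads exit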

Am I missing something? What's the added value of using the fork/join framework?

Solution

I think the basic misunderstanding is that the Fork/Join examples do NOT show work stealing but only some kind of standard divide and conquer.

Work stealing would be like this: Worker B has finished his work. He is a kind one, so he looks around and sees Worker A still working very hard. He strolls over and asks: "Hey lad, I could give you a hand." A replies: "Cool, I have this task of 1000 units. So far I have finished 345, leaving 655. Could you please work on numbers 673 to 1000? I'll do 346 to 672." B says: "OK, let's start so we can go to the pub earlier."

You see: the workers must communicate with each other even after they have started the real work. This is the missing part in the examples.

The examples, on the other hand, show only something like "use subcontractors":

Worker A: "Dang, I have 1000 units of work. Too much for me. I'll do 500 myself and subcontract 500 to someone else." This goes on until the big task is broken down into small packets of 10 units each. These will be executed by the available workers. But if one packet is a kind of poison pill and takes considerably longer than other packets -- bad luck, the divide phase is over.

The only remaining difference between Fork/Join and splitting the task upfront is this: when splitting upfront, the work queue is full right from the start. Example: 1000 units, the threshold is 10, so the queue has 100 entries. These packets are distributed to the thread pool members.

Fork/Join is more complex and tries to keep the number of packets in the queue smaller:

  • Step 1: Put one packet containing (1..1000) into the queue.
  • Step 2: One worker pops the packet (1..1000) and replaces it with two packets: (1..500) and (501..1000).
  • Step 3: One worker pops packet (501..1000) and pushes (501..750) and (751..1000).
  • Step n: The stack contains these packets: (1..500), (501..750), (751..875), ..., (991..1000).
  • Step n+1: Packet (991..1000) is popped and executed.
  • Step n+2: Packet (981..990) is popped and executed.
  • Step n+3: Packet (961..980) is popped and split into (961..970) and (971..980). And so on.

You see: in Fork/Join the queue is smaller (6 in the example) and the "split" and "work" phases are interleaved.
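
The interleaving becomes visible when compute() is written with an explicit fork/join pair instead of invokeAll. The same compute() from the sketch above, rewritten in that style (again only illustrative):

    @Override
    protected void compute() {
        if (mLength < sThreshold) {
            computeDirectly();   // "work" phase
            return;
        }
        int split = mLength / 2;
        ForkBlurAction left =
                new ForkBlurAction(mSource, mStart, split, mDestination);
        ForkBlurAction right =
                new ForkBlurAction(mSource, mStart + split, mLength - split, mDestination);
        left.fork();     // "split" phase: push one half onto this worker's deque,
                         // from where an idle worker may steal it
        right.compute(); // keep splitting/working on the other half right away
        left.join();     // wait for the forked half (running it yourself if nobody stole it)
    }

The worker keeps one half and carries on with it immediately, so only a handful of packets sit in its deque at any moment, and idle workers steal from the far end of that deque, which is exactly the behaviour described in the list above.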

When multiple workers are popping and pushing simultaneously, the interactions are of course not as clear-cut.
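
Work stealing itself happens inside ForkJoinPool rather than in the task code, but the pool exposes a rough counter for it. A minimal sketch for checking that steals actually occur, assuming the ForkBlurAction class sketched above:

import java.util.concurrent.ForkJoinPool;

public class BlurDemo {
    public static void main(String[] args) {
        int[] src = new int[10_000_000]; // stand-in for the source image pixels
        int[] dst = new int[src.length];

        ForkJoinPool pool = new ForkJoinPool(); // one worker per available processor by default
        pool.invoke(new ForkBlurAction(src, 0, src.length, dst));

        // Rough estimate of how many packets were taken from other workers' deques:
        System.out.println("steals: " + pool.getStealCount());
    }
}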
