叉加入优化 [英] Fork Join optimization

查看:169
本文介绍了叉加入优化的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想要的



我想要优化fork / join算法。通过优化我的意思是只计算最佳线程数,或者如果你想 - 计算 SEQUENTIAL_THRESHOLD (见下面的代码)。

  // PSEUDOCODE 
结果解决(问题问题){
if(problem.size< SEQUENTIAL_THRESHOLD)
return solveSequentially
else {
结果left,right;
INVOKE-IN-PARALLEL {
left = solve(extractLeftHalf(problem));
right = solve(extractRightHalf(problem));
}
return combine(left,right);
}
}

我如何想象



例如,我想计算大数组的乘积。然后我只是评估所有组件并获得最佳线程数量:



SEQUENTIAL_THRESHOLD = PC * IS / MC 示例)



PC - 处理器核数;
is - 常量,表示具有一个处理器核心的最佳数组大小,以及对数据(例如读取)的最简单操作。
MC - 乘以运算成本;



假设MC = 15; PC = 4和IS = 10000; SEQUENTIAL_THRESHOLD = 2667 。如果子任务数组大于2667,我会分叉。



广泛的问题


  1. 是否可以用这种方式生成SEQUENTIAL_THRESHOLD公式?

  2. 可以为更复杂的计算完成同样的操作:数组/集合和排序?

b

已经存在一些关于计算 SEQUENTIAL_THRESHOLD 的数组/集合/排序的调查?



更新日期:2014年3月7日


  • 如果没有办法为阈值计算编写单个公式,我可以编写一个util,它将在PC上执行预定义测试,而不是获得最优阈值?这是不可能还是不可能?

  • Java 8 Streams API有什么作用?它能帮助我吗? Java 8 Streams API消除了Fork / Join中的需要?


  • 解决方案

    肯定没有办法计算一个适当的阈值,除非你与执行环境亲密。我在sourceforge.net上维护一个fork / join项目,这是我在大多数内置函数中使用的代码:

      private int calcThreshold(int nbr_elements,int passed_threshold){

    //会话中的线程总数
    //数组中的元素总数
    int threads = getNbrThreads();
    int count = nbr_elements + 1;

    //当只有一个线程时,它不支付分解工作,
    //强制超过数组长度的阈值
    if(threads == 1)return count ;

    / *
    *无论如何
    *
    * /
    int threshold = passed_threshold;

    //当呼叫者建议一个值
    if(threshold> 0){

    //只是跟随呼叫者的建议或做一些建议

    } else {
    //做一些有用的事情,比如使用线程的任务数量的8倍或
    //默认为32k
    int temp = count / (线<< 3);
    threshold =(temp <32768)? 32768:temp;

    } // endif

    //无论
    返回阈值;

    }

    3月9日编辑:



    你怎么能有一个通用的实用程序,不仅知道处理器速度,可用内存,处理器数量等(物理环境),而且知道软件的意图吗?答案是你不能。这就是为什么你需要为每个环境开发一个例程。上面的方法是我用于基本数组(向量。)我使用另一个大多数矩阵处理:

     小,只是散布每一行
    if(count< 6)return 1;

    //小的时候传播一点
    if(count< 30)return((count /(threads<< 2)== 0)?threads: (线<<< 2)));

    //这对现在很好
    return((count /(threads<< 3)== 0)?threads:(count /(threads< ));

    对于Java8流:他们使用F / J框架在底层,你不能指定阈值。


    What I want

    I want to work on optimization of fork/join algorithm. By optimization I mean just calculation of optimal number of threads, or if you want - calculation of SEQUENTIAL_THRESHOLD (see code below).

    // PSEUDOCODE
    Result solve(Problem problem) { 
        if (problem.size < SEQUENTIAL_THRESHOLD)
            return solveSequentially(problem);
        else {
            Result left, right;
            INVOKE-IN-PARALLEL { 
                left = solve(extractLeftHalf(problem));
                right = solve(extractRightHalf(problem));
            }
            return combine(left, right);
        }
    }
    

    How do I imagine that

    For example, I want to calculate the product of big array. Then I just evaluate all components and get the optimal threads amount:

    SEQUENTIAL_THRESHOLD = PC * IS / MC (just example)

    PC - number of processor cores; IS - constant, that indicates the optimal array size with one processor core and the simplest operation on data (for example reading); MC - multiply operation cost;

    Suppose MC = 15; PC = 4 and IS = 10000; SEQUENTIAL_THRESHOLD = 2667. Than if subtask-array is bigger than 2667 I'll fork it.

    Broad questions

    1. Is it possible to make SEQUENTIAL_THRESHOLD formula in such way?
    2. Is it possible to accomplish the same for more complex computation: not only for operations on arrays/collections and sorting?

    Narrow question:

    Do already exist some investigations about calculation of SEQUENTIAL_THRESHOLD for arrays/collections/sorting? How do they accomplish that?

    Updated 07 March 2014:

    1. If there is no way to write a single formula for threshold calculation, can I write an util which will perform predefined tests on PC, and than gets the optimal threshold? Is that also impossible or not?
    2. What can Java 8 Streams API do? Can it help me? Does Java 8 Streams API eliminate a need in Fork/Join?

    解决方案

    There is absolutely, positively no way to calculate a proper threshold unless you are intimate with the execution environment. I maintain a fork/join project on sourceforge.net and this is the code I use in most built-in-functions:

    private int calcThreshold(int nbr_elements, int passed_threshold) {
    
      // total threads in session
      // total elements in array
      int threads = getNbrThreads();
      int count   = nbr_elements + 1;
    
      // When only one thread, it doesn't pay to decompose the work,
      //   force the threshold over array length
      if  (threads == 1) return count;    
    
      /*
       * Whatever it takes
       * 
       */
      int threshold = passed_threshold;
    
      // When caller suggests a value
      if  (threshold > 0) {
    
          // just go with the caller's suggestion or do something with the suggestion
    
      } else {
          // do something usful such as using about 8 times as many tasks as threads or
          //   the default of 32k
          int temp = count / (threads << 3);
          threshold = (temp < 32768) ? 32768 : temp;
    
      } // endif    
    
      // whatever
      return threshold;
    
    }
    

    Edit on 9 March:

    How can you possibly have a general utility that can know not only the processor speed, memory available, number of processors, etc. (the physical environment) but the intention of the software? The answer is you cannot. Which is why you need to develop a routine for each environment. The above method is what I use for basic arrays (vectors.) I use another for most matrix processing:

    // When very small, just spread every row
    if  (count < 6) return 1;
    
    // When small, spread a little 
    if  (count < 30) return ((count / (threads << 2) == 0)? threads : (count / (threads << 2)));  
    
    // this works well for now
    return ((count / (threads << 3) == 0)? threads : (count / (threads << 3))); 
    

    As far as Java8 streams: They use the F/J framework under the hood and you cannot specify a threshold.

    这篇关于叉加入优化的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆