在C ++中使用OpenMP并行化递归函数 [英] Parallelizing recursive function using OpenMP in C++

查看:565
本文介绍了在C ++中使用OpenMP并行化递归函数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下要使用OpenMP并行化的递归程序:

I have the following recursive program which I would like to parallelize using OpenMP:

#include <iostream>
#include <cmath>
#include <numeric>
#include <vector>
#include <algorithm>
#include <thread>
#include <omp.h>


// Determines if a point of dimension point.size() is within the sphere
bool isPointWithinSphere(std::vector<int> point, const double &radius) {

    // Since we know that the sphere is centered at the origin, we can simply
    // find the euclidean distance (square root of the sum of squares) and check to
    // see if it is less than or equal to the length of the radius 

    //square each element inside the point vector
    std::transform(point.begin(), point.end(), point.begin(), [](auto &x){return std::pow(x,2);});

    //find the square root of the sum of squares and check if it is less than or equal to the radius
    return std::sqrt(std::accumulate(point.begin(), point.end(), 0, std::plus<int>())) <= radius;    
}

// Counts the number of lattice points inside the sphere( all points (x1 .... xn) such that xi is an integer )

// The algorithm: If the radius is a floating point value, first find the floor of the radius and cast it to 
// an integer. For example, if the radius is 2.43 then the only integer points we must check are those between
// -2 and 2. We generate these points by simulating n - nested loops using recursion and passing each point
// in to the boolean function isPointWithinSphere(...), if the function returns true, we add one to the count
// (we have found a lattice point on the sphere). 

int countLatticePoints(std::vector<int> point, const double radius, const int dimension, int count = 0) {

    const int R = static_cast<int>(std::floor(radius));

    #pragma omp parallel for
    for(int i = -R; i <= R; i++) {
        point.push_back(i);

        if(point.size() == dimension){
            if(isPointWithinSphere(point, radius)) count++;
        }else count = countLatticePoints(point, radius, dimension, count);

        point.pop_back();

    }

    return count;
}

int main(int argc, char ** argv) {
    std::vector<int> vec;

    #pragma omp parallel
    std::cout << countLatticePoints(vec, 5, 7) << std::endl;   

    return 0;
}

我尝试在main函数中添加一个并行区域,以及在countLatticePoints中并行化for循环,但是我几乎看不到并行化与顺序运行算法之间的任何改进. 根据我可以使用的其他OpenMP策略,可以提供任何帮助/建议.

I have tried adding a parallel region within the main function as well as parallelizing the for loop within countLatticePoints yet I see hardly any improvement gained from parallelizing vs running the algorithm sequentially. Any help / advice would be appreciated in terms of other OpenMP strategies that I can use.

推荐答案

我将采用建议路线.在尝试使用线程提高程序速度之前,您首先要在单线程情况下提高程序速度.您可以进行一些改进.您正在复制很多点矢量,这会导致大量昂贵的内存分配.

I'll take the advice route. Before trying to make your program faster using threads, you first want to make it faster in the single threaded case. There are several improvements you can make. You're making a lot of copies of your point vectors, which incurs a lot of expensive memory allocations.

point传递给isPointWithinSphere作为参考.然后,不要使用两个循环,而要使用一个循环来平方并累加point中的每个元素.然后,在检查半径时,比较距离的平方而不是距离.这样可以避免调用sqrt并将其替换为简单的正方形.

Pass point into isPointWithinSphere as a reference. Then, rather than two loops, use one loop to square and accumulate the each element in point. Then, when checking the radius, compare the square of the distance rather than the distance. This avoids the sqrt call and replaces it with a simple square.

countLatticePoints也应作为参考引用point.每次调用递归时,都不必从dimension减去1,而是直接检查dimension == 1而不是计算大小.

countLatticePoints should also take point by reference. Rather than calling point.size(), subtract 1 from dimension each time you recurse, then just check for dimension == 1 instead of computing the size.

所有这些,如果您仍然希望/需要引入线程,由于需要逐个引用,因此您需要进行一些调整. countLatticePoint将需要具有两个变体,其中包含OpenMP伪指令的初始调用,以及不具有它们的递归变量.

With all that, if you still want/need to introduce threading, you'll need to make some adjustments due to passing point by reference. countLatticePoint will need to have two variants, the initial call that has the OpenMP directive in it, and the recursive one that does not have them.

main中的#pragma omp parallel不会执行任何操作,因为只有一小段代码可以执行.

The #pragma omp parallel in main won't do anything because there is only one block of code to execute.

这篇关于在C ++中使用OpenMP并行化递归函数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆