Trapezoidal rule integration using OpenMP and private clauses

Views: 60 | Posted: 2021/5/9 19:19:30 | Tags: multithreading fortran openmp gfortran


Question

I am adapting a serial code to run in parallel with OpenMP, but the parallel version gives a bad approximation of the desired result (the value of pi). Both codes are shown below, the parallel one first.

What is wrong?

program trap
use omp_lib 
implicit none
double precision::suma=0.d0 ! sum is a scalar
double precision:: h,x,lima,limb
integer::n,i, istart, iend, thread_num, total_threads=4, ppt
integer(kind=8):: tic, toc, rate
double precision:: time
double precision, dimension(4):: pi= 0.d0

call system_clock(count_rate = rate)
call system_clock(tic)

lima=0.0d0; limb=1.0d0; suma=0.0d0; n=10000000
h=(limb-lima)/n

suma=h*(f(lima)+f(limb))*0.5d0 !first and last points

ppt= n/total_threads
!$ call omp_set_num_threads(total_threads)

!$omp parallel private (istart, iend, thread_num, i)
  thread_num = omp_get_thread_num()
  !$ istart = thread_num*ppt +1
  !$ iend = min(thread_num*ppt + ppt, n-1)
do i=istart,iend ! this will control the loop in different images
  x=lima+i*h
  suma=suma+f(x) 
  pi(thread_num+1)=suma
enddo
!$omp end parallel

suma=sum(pi) 
suma=suma*h

print *,"The value of pi is= ",suma ! print once from the first image
!print*, 'pi=' , pi
call system_clock(toc)
time = real(toc-tic)/real(rate)
print*, 'Time ', time, 's'

contains

double precision function f(y)
double precision:: y
f=4.0d0/(1.0d0+y*y)
end function f

end program trap

!----------------------------------------------------------------------------------
program trap
implicit none
double precision::sum ! sum is a scalar
double precision:: h,x,lima,limb
integer::n,i
integer(kind=8):: tic, toc, rate
double precision:: time

call system_clock(count_rate = rate)
call system_clock(tic)

lima=0.0d0; limb=1.0d0; sum=0.0d0; n=10000000
h=(limb-lima)/n

sum=h*(f(lima)+f(limb))*0.5d0 !first and last points

do i=1,n-1 ! this will control the loop in different images
  x=lima+i*h
  sum=sum+f(x)
enddo

sum=sum*h

print *,"The value of pi is (serial exe)= ",sum ! print once from the first image

call system_clock(toc)
time = real(toc-tic)/real(rate)
print*, 'Time serial execution', time, 's'

contains

double precision function f(y)
double precision:: y
f=4.0d0/(1.0d0+y*y)
end function f

end program trap

Compiled with:

$ gfortran -fopenmp -Wall -Wextra -O2 -Wall -o prog.exe test.f90 
$ ./prog.exe

and

$ gfortran -Wall -Wextra -O2 -Wall -o prog.exe testserial.f90 
$ ./prog.exe

In serial execution I get a good approximation of pi (3.1415), but in parallel I get the following (several parallel runs shown):

 The value of pi is=    3.6731101425922810     

 Time    3.3386986702680588E-002 s

-------------------------------------------------------

 The value of pi is=    3.1556004791445953     

 Time    8.3681479096412659E-002 s

------------------------------------------------------

 The value of pi is=    3.2505952856717966     

 Time    5.1473543047904968E-002 s

Recommended answer

There is a problem in your OpenMP parallel statement. Every thread keeps adding to the same shared variable suma, which is a data race, so you need to specify a reduction clause. You also did not declare the variable x as private, so the threads overwrite each other's value of x.

I also changed some other parts of your code:

You explicitly told each thread which index range it should use. Most often the compiler and the OpenMP runtime can work that out more efficiently by themselves, so I changed parallel to parallel do.
It is good practice to use default(none) on an OpenMP parallel region. You then have to set the data-sharing attribute of every variable explicitly.
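
For reference, the quantity both programs approximate is the composite trapezoidal rule applied to f(x) = 4/(1+x^2) on [0, 1], whose exact integral is pi. Written with the names used in the code (a = lima, b = limb), the two end points carry weight 1/2 and the interior index runs from 1 to n-1:

```latex
\int_a^b f(x)\,dx \;\approx\; h\left[\frac{f(a)+f(b)}{2} + \sum_{i=1}^{n-1} f(a+ih)\right],
\qquad h=\frac{b-a}{n},
\qquad \int_0^1 \frac{4}{1+x^2}\,dx = \pi
```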

program trap
  use omp_lib
  implicit none
  double precision   :: suma,h,x,lima,limb, time
  integer            :: n, i
  integer, parameter :: total_threads=5
  integer(kind=8)    :: tic, toc, rate

  call system_clock(count_rate = rate)
  call system_clock(tic)

  lima=0.0d0; limb=1.0d0; suma=0.0d0; n=10000000
  h=(limb-lima)/n

  suma=0.5d0*(f(lima)+f(limb)) ! end points with weight 1/2; scaled by h after the loop

  call omp_set_num_threads(total_threads)
  !$omp parallel do default(none) private(i, x) shared(lima, h, n)  reduction(+: suma)
  do i = 1, n-1 ! interior points only; the end points are already in suma
    x=lima+i*h
    suma=suma+f(x)
  end do
  !$omp end parallel do

  suma=suma*h

  print *,"The value of pi is= ", suma ! print once from the first image
  call system_clock(toc)
  time = real(toc-tic)/real(rate)
  print*, 'Time ', time, 's'

contains

  double precision function f(y)
    double precision:: y
    f=4.0d0/(1.0d0+y*y)
  end function

end program
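
If you would rather keep the explicit per-thread index ranges from the question, the same two fixes (a reduction on suma and a private x) still apply. The following is only a sketch under some assumptions of mine: the program name trap_manual, the kind parameter dp, the use of omp_get_num_threads() to read the actual team size, and the final sanity check against 4*atan(1) are not part of the original code.

```fortran
program trap_manual
  use omp_lib
  implicit none
  integer, parameter  :: dp = kind(1.0d0)
  integer, parameter  :: n = 10000000
  real(dp), parameter :: lima = 0.0_dp, limb = 1.0_dp
  real(dp) :: h, x, suma
  integer  :: i, istart, iend, tid, nthreads, ppt

  h = (limb - lima) / n
  suma = 0.0_dp

  ! Each thread gets its own x and its own partial suma; the partial sums
  ! are added together by the reduction when the region ends.
  !$omp parallel default(none) shared(h) &
  !$omp   private(i, x, istart, iend, tid, nthreads, ppt) reduction(+: suma)
  tid      = omp_get_thread_num()
  nthreads = omp_get_num_threads()      ! actual team size, may differ from a request
  ppt      = (n - 1) / nthreads         ! interior points per thread
  istart   = tid*ppt + 1
  iend     = istart + ppt - 1
  if (tid == nthreads - 1) iend = n - 1 ! last thread also takes the remainder
  do i = istart, iend
    x = lima + i*h
    suma = suma + f(x)
  end do
  !$omp end parallel

  ! Add the two end points with weight 1/2, then scale by the step size.
  suma = h * (suma + 0.5_dp*(f(lima) + f(limb)))
  print *, "The value of pi is= ", suma

  ! Sanity check added for this sketch only.
  if (abs(suma - 4.0_dp*atan(1.0_dp)) > 1.0e-8_dp) error stop "poor approximation"

contains

  real(dp) function f(y)
    real(dp), intent(in) :: y
    f = 4.0_dp / (1.0_dp + y*y)
  end function f

end program trap_manual
```

It compiles with the same gfortran -fopenmp command as in the question. The named constants n, lima and limb need no data-sharing clause under default(none) because they are not variables. The parallel do version above is still the simpler choice, since it leaves the partitioning to OpenMP.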
