Fortran OpenMP program shows no speedup of CPU_TIME()


Problem description


Using parallelism should reduce a program's run time, but that did not happen for me. When I parallelized my code with OpenMP, the run time increased, i.e. PARALLEL TIME > SERIAL TIME.

My code:

    PROGRAM MAIN
    use omp_lib
    implicit none
    REAL*8 Times1,Times2
    INTEGER I,J
    real, allocatable, dimension(:) :: a
    allocate(a(1000))
    DO J = 1, 1000
    a(j)=j  
    ENDDO
!    ***************NO PARALLEL CODE ************************************
    call CPU_TIME(Times1)
    write(*,*) 'CPU NO PARALLEL STARTED:',Times1
    DO I = 1, 1000
    DO J = 1, 500000
    a(I)=a(I)+0.0001
    end do 
    a(I)=a(I)+a(I)+a(I)
    ENDDO
    call CPU_TIME(Times2)
    write(*,*) 'CPU CPU NO PARALLEL finished:',Times2
    write(*,*) 'NO PARALLEL TIMES:',Times2-Times1
    write(*,*) '---------------------------------------------------'
!    ***************PARALLEL CODE ************************************
    call CPU_TIME(Times1)
    write(*,*) 'CPU PARALLEL STARTED:',Times1
!$OMP PARALLEL DEFAULT(shared), private(I,J)
!$OMP DO
    DO I = 1, 1000
    DO J = 1, 500000
    a(I)=a(I)+0.0001
    end do 
    a(I)=a(I)+a(I)+a(I)
    ENDDO
!$OMP END DO
!$OMP END PARALLEL
    call CPU_TIME(Times2)
    write(*,*) 'CPU PARALLEL finished:',Times2
    write(*,*) 'PARALLEL TIMES:',Times2-Times1
    deallocate(a)
    STOP
    END

And the result:

 CPU NO PARALLEL STARTED:  1.560010000000000E-002
 CPU CPU NO PARALLEL finished:   4.86723120000000
 NO PARALLEL TIMES:   4.85163110000000


 CPU PARALLEL STARTED:   4.86723120000000
 CPU PARALLEL finished:   9.89046340000000
 PARALLEL TIMES:   5.02323220000000

Why does the time measured by CPU_TIME() increase with OpenMP?

Solution

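cpu_time() measures the time spent on the CPU, not the wall-clock time. In parallel applications these are not the same: with many compilers (gfortran, for example) the CPU time is accumulated over all threads, so it does not shrink when the work is spread across cores, and thread overhead can even make it grow.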

Using system_clock() solves this problem:

    PROGRAM MAIN
    use omp_lib
    implicit none
    REAL*8 Times1,Times2
    INTEGER I,J, iTimes1,iTimes2, rate
    real, allocatable, dimension(:) :: a
    allocate(a(1000))

    CALL system_clock(count_rate=rate)
    DO J = 1, 1000
    a(j)=j  
    ENDDO
!    ***************NO PARALLEL CODE ************************************
    call CPU_TIME(Times1)
    call SYSTEM_CLOCK(iTimes1)
    write(*,*) 'CPU NO PARALLEL STARTED:',Times1
    DO I = 1, 1000
    DO J = 1, 500000
    a(I)=a(I)+0.0001
    end do 
    a(I)=a(I)+a(I)+a(I)
    ENDDO
    call CPU_TIME(Times2)
    call SYSTEM_CLOCK(iTimes2)
    write(*,*) 'CPU CPU NO PARALLEL finished:',Times2
    write(*,*) 'NO PARALLEL TIMES:',Times2-Times1, real(iTimes2-iTimes1)/real(rate)
    write(*,*) '---------------------------------------------------'
!    ***************PARALLEL CODE ************************************
    call CPU_TIME(Times1)
    call SYSTEM_CLOCK(iTimes1)
    write(*,*) 'CPU PARALLEL STARTED:',Times1
!$OMP PARALLEL DEFAULT(shared), private(I,J)
!$OMP DO
    DO I = 1, 1000
    DO J = 1, 500000
    a(I)=a(I)+0.0001
    end do 
    a(I)=a(I)+a(I)+a(I)
    ENDDO
!$OMP END DO
!$OMP END PARALLEL
    call CPU_TIME(Times2)
    call SYSTEM_CLOCK(iTimes2)

    write(*,*) 'CPU PARALLEL finished:',Times2
    write(*,*) 'PARALLEL TIMES:',Times2-Times1, real(iTimes2-iTimes1)/real(rate)
    deallocate(a)
    STOP
    END

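Then you can see that the parallel program is indeed faster. In the output below, the first number on each "TIMES" line is the CPU time and the second is the wall-clock time from system_clock(): the wall-clock time drops from about 1.45 s to about 0.92 s, even though the CPU time rises from 1.456 s to 3.644 s.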

 CPU NO PARALLEL STARTED:   4.0000000000000001E-003
 CPU CPU NO PARALLEL finished:   1.4600000000000000     
 NO PARALLEL TIMES:   1.4560000000000000        1.45400000    
 ---------------------------------------------------
 CPU PARALLEL STARTED:   1.4600000000000000     
 CPU PARALLEL finished:   5.1040000000000001     
 PARALLEL TIMES:   3.6440000000000001       0.920000017  
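
Since the program already uses omp_lib, the OpenMP routine omp_get_wtime() is another way to read wall-clock time. The following is only a minimal sketch, not part of the original answer: the program name wtime_demo, the variable names, and the checksum line are illustrative assumptions, and it assumes an OpenMP-enabled build (for example, gfortran -fopenmp).

```fortran
program wtime_demo
    use omp_lib
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    integer, parameter :: n = 1000, m = 500000   ! same loop sizes as above
    real, allocatable :: a(:)
    real(dp) :: cpu0, cpu1, wall0, wall1
    integer :: i, j

    allocate(a(n))
    do i = 1, n
        a(i) = real(i)
    end do

    call cpu_time(cpu0)        ! CPU time; often summed over all threads
    wall0 = omp_get_wtime()    ! wall-clock time in seconds

    !$omp parallel do default(shared) private(i, j)
    do i = 1, n
        do j = 1, m
            a(i) = a(i) + 0.0001
        end do
        a(i) = a(i) + a(i) + a(i)
    end do
    !$omp end parallel do

    call cpu_time(cpu1)
    wall1 = omp_get_wtime()

    write(*,*) 'max threads  :', omp_get_max_threads()
    write(*,*) 'CPU time  (s):', cpu1 - cpu0
    write(*,*) 'wall time (s):', wall1 - wall0
    ! Print a value derived from a so the loop cannot be optimized away
    write(*,*) 'checksum     :', sum(a)

    deallocate(a)
end program wtime_demo
```

When this runs on several threads, the wall time should be the smaller of the two numbers, while the CPU time stays roughly at the total work done by all threads.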
