Fortran intrinsic timing routines, which is better? cpu_time or system_clock
Question
When timing a Fortran program I usually just use the command call cpu_time(t).
Then I stumbled across call system_clock([count,count_rate,count_max]), which seems to do the same thing, but in a more difficult manner.
My knowledge of these comes from old Intel documentation; I wasn't able to find it on Intel's homepage. See my markup below.
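- Which is more accurate, or are they similar?
- Does one of them count cache misses (or other kinds of stalls) while the other does not, or does neither?
- Or is the only difference the thing marked in my markup below?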
Those are my questions. Below I have supplied code so you can see some timings and usage. The timings show that the two routines are very similar in output and thus seem to be similar in implementation.
I should note that I will probably always stick with cpu_time, and that I don't really need more precise timings.
In the code below I have tried to compare them. (I have also tried more elaborate things, but will not supply them, for the sake of brevity.) So basically my result is that:
cpu_time
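- Is easier to use; it doesn't need the initialization call.
- Gives the time directly as a difference.
- Should also be compiler specific, but there is no way to see the precision. (The norm is milliseconds.)
- Is the sum of thread time, i.e. not recommended for parallel runs.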
system_clock
- Needs pre-initialization.
- Needs post-processing, in the form of a divide. (A small thing, but nonetheless a difference.)
- Is compiler specific. On my PC the following was found:
  - Intel 12.0.4 uses a count rate of 10000, due to the INTEGER precision.
  - gcc-4.4.5 uses 1000; I do not know how this differentiates.
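The pre-initialization step and the dependence of the count rate on the INTEGER kind can be checked with a small sketch like the one below. It assumes a Fortran 2003 or later compiler that accepts non-default integer kinds as system_clock arguments; the program name clock_rates and the variable names are illustrative only, and the printed values are processor dependent.

```fortran
program clock_rates
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  integer(int32) :: rate32, max32
  integer(int64) :: rate64, max64

  ! Query the clock once per integer kind; a compiler may choose a
  ! different resolution depending on the kind of the arguments.
  call system_clock(count_rate=rate32, count_max=max32)
  call system_clock(count_rate=rate64, count_max=max64)

  print *, "int32: count_rate =", rate32, " count_max =", max32
  print *, "int64: count_rate =", rate64, " count_max =", max64

  ! Approximate seconds until the counter wraps (count_max / count_rate).
  ! A count_rate of zero means the processor has no clock.
  if (rate32 > 0) print *, "int32 wraps after (s):", real(max32) / real(rate32)
  if (rate64 > 0) print *, "int64 wraps after (s):", real(max64) / real(rate64)
end program clock_rates
```

Some compilers may report a much higher count rate for the wider kind, which would explain why the rate looks tied to the INTEGER precision.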
Code:
PROGRAM timer
  IMPLICIT NONE
  REAL :: t1,t2,rate
  INTEGER :: c1,c2,cr,cm,i,j,n,s
  INTEGER , PARAMETER :: x=20000,y=15000,runs=1000
  REAL :: array(x,y),a_diff,diff

  ! First initialize the system_clock
  CALL system_clock(count_rate=cr)
  CALL system_clock(count_max=cm)
  rate = REAL(cr)
  WRITE(*,*) "system_clock rate ",rate

  diff = 0.0
  a_diff = 0.0
  s = 0
  DO n = 1 , runs
    CALL CPU_TIME(t1)
    CALL SYSTEM_CLOCK(c1)
    FORALL(i = 1:x,j = 1:y)
      array(i,j) = REAL(i)*REAL(j) + 2
    END FORALL
    CALL CPU_TIME(t2)
    CALL SYSTEM_CLOCK(c2)
    array(1,1) = array(1,2)
    IF ( (c2 - c1)/rate < (t2-t1) ) s = s + 1
    diff = (c2 - c1)/rate - (t2-t1) + diff
    a_diff = ABS((c2 - c1)/rate - (t2-t1)) + a_diff
  END DO

  WRITE(*,*) "system_clock : ",(c2 - c1)/rate
  WRITE(*,*) "cpu_time : ",(t2-t1)
  WRITE(*,*) "sc < ct : ",s,"of",runs
  WRITE(*,*) "mean diff : ",diff/runs
  WRITE(*,*) "abs mean diff: ",a_diff/runs
END PROGRAM timer
For completeness, here is the output from my Intel 12.0.4 and gcc-4.4.5 compilers.
Intel 12.0.4 with -O0
system_clock rate 10000.00
system_clock : 2.389600
cpu_time : 2.384033
sc < ct : 1 of 1000
mean diff : 4.2409324E-03
abs mean diff: 4.2409897E-03
real 42m5.340s
user 41m48.869s
sys 0m12.233s
gcc-4.4.5 with -O0
system_clock rate 1000.0000
system_clock : 1.1849999
cpu_time : 1.1840820
sc < ct : 275 of 1000
mean diff : 2.05709646E-03
abs mean diff: 2.71424348E-03
real 19m45.351s
user 19m42.954s
sys 0m0.348s
Thanks for reading...
Recommended answer
These two intrinsics report different types of time. system_clock reports "wall time" or elapsed time. cpu_time reports time used by the CPU. On a multi-tasking machine these could be very different, e.g., if your process shared the CPU equally with three other processes and therefore received 25% of the CPU and used 10 cpu seconds, it would take about 40 seconds of actual elapsed or wall clock time.
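The difference between the two kinds of time can be made visible with a minimal sketch such as the following. It assumes a Fortran 2008 compiler (for execute_command_line) and a POSIX-like shell that provides a sleep command; the program name wall_vs_cpu is illustrative. While the program waits for the child process, wall time passes but the program itself uses almost no CPU.

```fortran
program wall_vs_cpu
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer(int64) :: c1, c2, rate
  real(real64)   :: t1, t2

  ! Ticks per second of the wall clock; zero means no clock is available.
  call system_clock(count_rate=rate)
  if (rate <= 0) stop "no system clock available"

  call cpu_time(t1)
  call system_clock(c1)

  ! Assumption: a shell with a "sleep" command exists on this system.
  ! The wait happens in a child process, not in this program's CPU time.
  call execute_command_line("sleep 1")

  call cpu_time(t2)
  call system_clock(c2)

  print '(a,f8.3,a)', "wall (system_clock): ", &
        real(c2 - c1, real64) / real(rate, real64), " s"
  print '(a,f8.3,a)', "cpu  (cpu_time)    : ", t2 - t1, " s"
end program wall_vs_cpu
```

On a typical system the wall figure should come out near one second while the CPU figure stays close to zero, which is the same effect as a process that only receives a fraction of the CPU.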