Fortran intrinsic timing routines, which is better? cpu_time or system_clock
Problem description
- Which is more accurate, or are they similar?
- Does one of them count cache misses (or similar effects) while the other does not, or do both?
- Or is the only difference the points marked in my summary below?
cpu_time
- Is easier to use; you don't need the initialization calls.
- Gives the time directly as a difference (t2 - t1).
- Should also be compiler specific, but there is no way to see the precision. (The norm is milliseconds.)
- Is the sum of the time of all threads, i.e. not recommended for parallel runs.
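cpu_time has no argument that reports its precision, but the step size can be estimated empirically. The following is only a rough sketch, not a definitive method: it polls cpu_time until the returned value changes and prints the first observed increment. The program name and the polls counter are made up for illustration, and the result depends on the compiler and operating system.

```fortran
program cpu_time_resolution
  implicit none
  real    :: t0, t1
  integer :: polls

  call cpu_time(t0)
  ! The standard lets cpu_time return a negative value when no
  ! meaningful processor time is available.
  if (t0 < 0.0) then
     print *, "cpu_time is not available on this processor"
     stop
  end if

  ! Busy-wait until the reported CPU time moves forward.
  t1 = t0
  polls = 0
  do while (t1 <= t0 .and. polls < huge(polls))
     call cpu_time(t1)
     polls = polls + 1
  end do

  print *, "smallest observed cpu_time step (s):", t1 - t0
  print *, "polls needed:", polls
end program cpu_time_resolution
```

The printed step is an upper bound on the resolution seen in this one run, so repeating it a few times gives a better picture.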
system_clock
- Needs pre-initialization.
- Needs post-processing, in the form of a division. (A small thing, but nonetheless a difference.)
- Is compiler specific. On my PC the following was found:
  - Intel 12.0.4 uses a count rate of 10000, due to the INTEGER precision.
  - gcc-4.4.5 uses 1000; I do not know why this differs.
- Is prone to wraparounds, i.e. c1 > c2, due to count_max.
- Is time measured from one standard time. Thus this will yield the actual time of one thread and not the sum.
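The wraparound concern can be handled in two ways, sketched here under some assumptions. First, with compilers that follow Fortran 2003 or later, system_clock accepts integer arguments of a wider kind, which usually gives a much larger count_max. Second, a single wrap can be corrected by adding count_max + 1 to a negative difference, because the counter resets to zero after reaching count_max. The helper elapsed_ticks and the program name are hypothetical names introduced for this example.

```fortran
program system_clock_wraparound
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer(int64) :: c1, c2, rate, cmax
  integer :: i
  real    :: x

  ! Query the clock properties with 64-bit integers.
  call system_clock(count_rate=rate, count_max=cmax)
  if (rate <= 0) then
     print *, "no system clock available"
     stop
  end if

  call system_clock(c1)
  x = 0.0
  do i = 1, 10000000
     x = x + sqrt(real(i))
  end do
  call system_clock(c2)

  print *, "work result (printed so the loop is kept):", x
  print *, "count_rate:", rate, " count_max:", cmax
  print *, "elapsed (s):", real(elapsed_ticks(c1, c2, cmax)) / real(rate)

  ! Synthetic check: from cmax-5 to 4 across one wrap is 10 ticks.
  print *, "ticks across a wrap:", elapsed_ticks(cmax - 5_int64, 4_int64, cmax)

contains

  ! Tick difference that tolerates at most one counter wrap.
  pure function elapsed_ticks(first, last, max_count) result(ticks)
    integer(int64), intent(in) :: first, last, max_count
    integer(int64) :: ticks
    ticks = last - first
    ! Parentheses keep the intermediate sum below max_count.
    if (ticks < 0) ticks = (ticks + max_count) + 1
  end function elapsed_ticks

end program system_clock_wraparound
```

This correction cannot detect more than one wrap between the two calls, so it only helps when the timed region is shorter than one full counter period.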
Intel 12.0.4 with -O0:
system_clock rate 10000.00
system_clock : 2.389600
cpu_time : 2.384033
sc < ct : 1 of 1000
mean diff : 4.2409324E-03
abs mean diff: 4.2409897E-03
real 42m5.340s
user 41m48.869s
sys 0m12.233s

gcc-4.4.5 with -O0:
system_clock rate 1000.0000
system_clock : 1.1849999
cpu_time : 1.1840820
sc < ct : 275 of 1000
mean diff : 2.05709646E-03
abs mean diff: 2.71424348E-03
real 19m45.351s
user 19m42.954s
sys 0m0.348s
When timing a Fortran program I usually just use call cpu_time(t). Then I stumbled across call system_clock([count, count_rate, count_max]), which seems to do the same thing, only in a more cumbersome manner.
My knowledge of these routines comes from old Intel documentation, which I wasn't able to find on Intel's homepage. See my summary of the two routines above.
The questions above are what I would like answered. Below I have supplied code that shows some timings and usage. The timings show that the two routines are very similar in output and thus seem to be similar in implementation.
I should note that I will probably always stick with cpu_time, and that I don't really need more precise timings.
In the code below I have tried to compare them. (I have also tried more elaborate things, but will not include them, for the sake of brevity.) My basic findings are the ones listed above for cpu_time and system_clock.
Code:
PROGRAM timer
  IMPLICIT NONE
  REAL :: t1,t2,rate
  INTEGER :: c1,c2,cr,cm,i,j,n,s
  INTEGER , PARAMETER :: x=20000,y=15000,runs=1000
  REAL :: array(x,y),a_diff,diff

  ! First initialize the system_clock
  CALL system_clock(count_rate=cr)
  CALL system_clock(count_max=cm)
  rate = REAL(cr)
  WRITE(*,*) "system_clock rate ",rate

  diff = 0.0
  a_diff = 0.0
  s = 0
  DO n = 1 , runs
    ! Time the same array fill with both routines
    CALL CPU_TIME(t1)
    CALL SYSTEM_CLOCK(c1)
    FORALL(i = 1:x,j = 1:y)
      array(i,j) = REAL(i)*REAL(j) + 2
    END FORALL
    CALL CPU_TIME(t2)
    CALL SYSTEM_CLOCK(c2)
    array(1,1) = array(1,2)
    ! Count the runs where system_clock reports less time than cpu_time
    IF ( (c2 - c1)/rate < (t2-t1) ) s = s + 1
    ! Accumulate the signed and absolute differences
    diff = (c2 - c1)/rate - (t2-t1) + diff
    a_diff = ABS((c2 - c1)/rate - (t2-t1)) + a_diff
  END DO

  ! The two timings printed here are those of the last run only
  WRITE(*,*) "system_clock : ",(c2 - c1)/rate
  WRITE(*,*) "cpu_time : ",(t2-t1)
  WRITE(*,*) "sc < ct : ",s,"of",runs
  WRITE(*,*) "mean diff : ",diff/runs
  WRITE(*,*) "abs mean diff: ",a_diff/runs
END PROGRAM timer
For completeness, the output from my Intel 12.0.4 and gcc-4.4.5 compilers is given above.
Thanks for reading...
These two intrinsics report different types of time. system_clock reports "wall time", i.e. elapsed time. cpu_time reports the time used by the CPU. On a multi-tasking machine these can be very different. For example, if your process shared the CPU equally with three other processes, and therefore received 25% of the CPU, and used 10 CPU seconds, it would take about 40 seconds of actual elapsed (wall-clock) time.
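The difference can be made visible with a small experiment. This is only a sketch, assuming a Fortran 2008 compiler (for execute_command_line) and a POSIX-like system where a sleep command exists. The process waits for about one second without doing any work, so the system_clock interval should be close to one second while the cpu_time interval should stay near zero on typical implementations.

```fortran
program wall_vs_cpu
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer(int64) :: c1, c2, rate
  real :: t1, t2

  call system_clock(count_rate=rate)

  call cpu_time(t1)
  call system_clock(c1)

  ! Assumption: a shell with a `sleep` command is available.
  ! The process idles here, so it uses almost no CPU time.
  call execute_command_line("sleep 1", wait=.true.)

  call cpu_time(t2)
  call system_clock(c2)

  print *, "wall time (s):", real(c2 - c1) / real(rate)
  print *, "cpu time  (s):", t2 - t1
end program wall_vs_cpu
```

The opposite case, where CPU time exceeds wall time, can occur in multi-threaded runs, since cpu_time may sum the time of all threads depending on the implementation.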