Printing time in a Python multiprocessing script returns negative elapsed time
Question
Using Python 2.7.6 on Ubuntu 14.
I simplified the script to show my problem:
import time
import multiprocessing

data = range(1, 3)
start_time = time.clock()

def lol():
    for i in data:
        print time.clock() - start_time, "lol seconds"

def worker(n):
    print time.clock() - start_time, "multiprocesor seconds"

def mp_handler():
    p = multiprocessing.Pool(1)
    p.map(worker, data)

if __name__ == '__main__':
    lol()
    mp_handler()
Output:
8e-06 lol seconds
6.9e-05 lol seconds
-0.030019 multiprocesor seconds
-0.029907 multiprocesor seconds
Process finished with exit code 0
Using time.time() gives non-negative values (as noted in "Timer shows negative time elapsed"), but I'm curious what the problem is with time.clock() in Python multiprocessing and with reading time from the CPU.
Recommended answer
multiprocessing spawns new processes, and on Linux time.clock() has the same meaning as C's clock():
"The value returned is the CPU time used so far as a clock_t;"
So the values returned by clock() restart from 0 when a process starts. However, your code uses the parent process's start_time to determine the time spent in the child process, which is incorrect once the child process's CPU time has reset.
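One way to avoid the negative values is to read both the start and the end of the interval in the same process. The following is a minimal sketch, not the asker's script: it assumes Python 3 (where time.process_time() is the per-process CPU clock and time.clock() no longer exists as of 3.8) and a POSIX system where the fork start method is available. The names busy_work, timed_in_child and measure_child_cpu are hypothetical.

```python
import multiprocessing
import time


def busy_work(n):
    """Burn a little CPU so there is something to measure."""
    return sum(i * i for i in range(n))


def timed_in_child(n, queue):
    # Both readings come from the child's own CPU clock, so the
    # difference does not depend on the parent's start_time.
    start = time.process_time()
    busy_work(n)
    queue.put(time.process_time() - start)


def measure_child_cpu(n=200000):
    # Assumption: POSIX system, so the "fork" start method exists.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    proc = ctx.Process(target=timed_in_child, args=(n, queue))
    proc.start()
    elapsed = queue.get(timeout=30)  # read the result before joining
    proc.join()
    return elapsed


if __name__ == "__main__":
    print(measure_child_cpu(), "CPU seconds used by the child's task")
```

Because the start is taken after the child already exists, the subtraction compares two readings of the same clock and cannot go negative.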
The clock() function makes sense only when handling one process, because its return value is the CPU time spent by that process. Child processes are not taken into account.
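To make the difference between CPU time and wall-clock time concrete, here is a small sketch under the assumption of Python 3, using time.process_time() as the CPU clock and time.perf_counter() as the wall-clock timer. The function name sleep_and_measure is hypothetical. Sleeping uses almost no CPU, so the two readings diverge.

```python
import time


def sleep_and_measure(seconds=0.2):
    """Return (cpu_used, wall_elapsed) around a sleep of `seconds`."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()

    time.sleep(seconds)  # the process is idle here, so little CPU is used

    cpu_used = time.process_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return cpu_used, wall_elapsed


if __name__ == "__main__":
    cpu_used, wall_elapsed = sleep_and_measure()
    print("CPU: %.6f s, wall: %.6f s" % (cpu_used, wall_elapsed))
```

The CPU figure stays close to zero while the wall-clock figure is at least the sleep duration, which is why a CPU clock is a poor fit for measuring how long something took from the user's point of view.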
The time() function, on the other hand, uses a system-wide clock, and thus can be used even between different processes (although it is not monotonic, so it might return wrong results if somebody changes the system time during the events).
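As an illustration of using the system-wide clock across processes, the sketch below passes the parent's time.time() reading to a child and lets the child compute the elapsed wall-clock time. It assumes Python 3 and a POSIX system with the fork start method; report_wall_elapsed and wall_elapsed_in_child are hypothetical names.

```python
import multiprocessing
import time


def report_wall_elapsed(parent_start, queue):
    # time.time() reads a system-wide clock, so a value taken in the
    # parent can be compared with a value taken in the child.
    queue.put(time.time() - parent_start)


def wall_elapsed_in_child():
    # Assumption: POSIX system, so the "fork" start method exists.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    parent_start = time.time()
    proc = ctx.Process(target=report_wall_elapsed,
                       args=(parent_start, queue))
    proc.start()
    elapsed = queue.get(timeout=30)  # read the result before joining
    proc.join()
    return elapsed


if __name__ == "__main__":
    print(wall_elapsed_in_child(), "wall-clock seconds since parent start")
```

On Python 3, time.monotonic() is documented as system-wide since version 3.5, so it may be a reasonable substitute when changes to the system time are a concern.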
Forking a running Python instance is probably faster than starting a new one from scratch, hence start_time is almost always bigger than the value returned by time.clock() in the child.
Take into account that the parent process also had to read your file on disk and perform the imports, which may require reading other .py files, searching directories, etc.
The forked child processes don't have to do all that.
Example code that shows that the return value of time.clock() resets to 0:
from __future__ import print_function
import time
import multiprocessing

data = range(1, 3)
start_time = time.clock()

def lol():
    for i in data:
        t = time.clock()
        print('t: ', t, end='\t')
        print(t - start_time, "lol seconds")

def worker(n):
    t = time.clock()
    print('t: ', t, end='\t')
    print(t - start_time, "multiprocesor seconds")

def mp_handler():
    p = multiprocessing.Pool(1)
    p.map(worker, data)

if __name__ == '__main__':
    print('start_time', start_time)
    lol()
    mp_handler()
Result:
$ python ./testing.py
start_time 0.020721
t: 0.020779 5.8e-05 lol seconds
t: 0.020804 8.3e-05 lol seconds
t: 0.001036 -0.019685 multiprocesor seconds
t: 0.001166 -0.019555 multiprocesor seconds
Note how t is monotonic in the lol case, while it goes back to 0.001 in the other case.