Printing time in Python multiprocessing script returns negative time elapsed

Question

Using Python 2.7.6 on Ubuntu 14.

I simplified the script to show my problem:

import time
import multiprocessing

data = range(1, 3)
start_time = time.clock()


def lol():
    for i in data:
        print time.clock() - start_time, "lol seconds"


def worker(n):
    print time.clock() - start_time, "multiprocesor seconds"


def mp_handler():
    p = multiprocessing.Pool(1)
    p.map(worker, data)

if __name__ == '__main__':
    lol()
    mp_handler()

Output:

8e-06 lol seconds
6.9e-05 lol seconds
-0.030019 multiprocesor seconds
-0.029907 multiprocesor seconds

    Process finished with exit code 0

Using time.time() gives non-negative values (as noted in the question "Timer shows negative time elapsed"), but I'm curious what the problem is with time.clock() in Python multiprocessing and with reading time from the CPU.

Recommended answer

multiprocessing spawns new processes, and on Linux time.clock() has the same meaning as C's clock():

The value returned is the CPU time used so far as a clock_t;

So the values returned by clock() restart from 0 when a process starts. However, your code uses the parent process's start_time to determine the time spent in the child process, which is incorrect because the child process's CPU time has been reset.
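One way to avoid the mismatch described above is to take the baseline in the same process that reads the clock at the end. The following is only a minimal sketch, assuming Python 3.3 or later, where time.process_time() stands in for time.clock() (which was removed in Python 3.8). The names cpu_timed and sum_of_squares are hypothetical helpers, not part of the original script.

```python
import multiprocessing
import time


def cpu_timed(func, *args):
    """Call func(*args); return (result, CPU seconds used by this process)."""
    start = time.process_time()  # baseline taken in the process doing the work
    result = func(*args)
    return result, time.process_time() - start


def sum_of_squares(n):
    # Hypothetical workload, used only to burn a little CPU time.
    return sum(i * i for i in range(n))


def worker(n):
    # Runs inside the pool process, so both readings come from the same CPU clock.
    return cpu_timed(sum_of_squares, n)


if __name__ == "__main__":
    with multiprocessing.Pool(1) as pool:
        for total, used in pool.map(worker, [100000, 200000]):
            print(total, used, "worker CPU seconds")
```

Because both clock readings happen inside the worker, the difference cannot go negative, regardless of how much CPU time the parent had used before the pool was created.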

The clock() function makes sense only when handling a single process, because its return value is the CPU time spent by that process. Child processes are not taken into account.
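The point that child processes are not counted can be checked with a short experiment. This is a sketch under stated assumptions: Python 3.3 or later (time.process_time() in place of time.clock()), and a POSIX system where the "fork" start method is available, matching the Linux setup in the question. The function names are hypothetical.

```python
import multiprocessing
import time


def burn_cpu(seconds):
    """Busy-loop until this process has used about `seconds` of CPU time."""
    start = time.process_time()
    while time.process_time() - start < seconds:
        pass


def _child(queue, seconds):
    # Runs in the child: measure the CPU time used by the child itself.
    start = time.process_time()
    burn_cpu(seconds)
    queue.put(time.process_time() - start)


def cpu_used_by_parent_and_child(seconds=0.2):
    """Return (parent_used, child_used) in CPU seconds."""
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX-only start method
    queue = ctx.Queue()
    parent_before = time.process_time()
    proc = ctx.Process(target=_child, args=(queue, seconds))
    proc.start()
    child_used = queue.get()  # read the result before joining
    proc.join()
    parent_used = time.process_time() - parent_before
    return parent_used, child_used


if __name__ == "__main__":
    parent_used, child_used = cpu_used_by_parent_and_child()
    print("parent CPU seconds:", parent_used)
    print("child CPU seconds: ", child_used)
```

The parent mostly waits while the child works, so its own CPU clock advances far less than the child's, even though the child's work happened "inside" the parent's call.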

The time() function, on the other hand, uses a system-wide clock, and thus can be used even across different processes (although it is not monotonic, so it might return wrong results if somebody changes the system time during the events).
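To illustrate the wall-clock alternative, the sketch below passes a time.time() reading from the parent to a child and lets the child compute the elapsed time. It assumes Python 3 and the POSIX "fork" start method. The function names are hypothetical, and the result is only trustworthy if the system clock is not adjusted while it runs.

```python
import multiprocessing
import time


def _report_elapsed(queue, parent_start):
    # time.time() reads a system-wide clock, so parent_start is meaningful here.
    queue.put(time.time() - parent_start)


def elapsed_seen_by_child(delay=0.05):
    """Return wall-clock seconds between a parent reading and a child reading."""
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX-only start method
    queue = ctx.Queue()
    parent_start = time.time()
    time.sleep(delay)  # stand-in for work done before the child starts
    proc = ctx.Process(target=_report_elapsed, args=(queue, parent_start))
    proc.start()
    elapsed = queue.get()  # read the result before joining
    proc.join()
    return elapsed


if __name__ == "__main__":
    print(elapsed_seen_by_child(), "wall-clock seconds")
```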

Forking a running Python instance is probably faster than starting a new one from scratch, hence start_time is almost always bigger than the value returned by time.clock() in the child. Take into account that the parent process also had to read your file from disk and perform the imports, which may require reading other .py files, searching directories, etc. The forked child processes don't have to do all that.

Example code showing that the return value of time.clock() resets to 0:

from __future__ import print_function

import time
import multiprocessing

data = range(1, 3)
start_time = time.clock()


def lol():
    for i in data:
        t = time.clock()
        print('t: ', t, end='\t')
        print(t - start_time, "lol seconds")


def worker(n):
    t = time.clock()
    print('t: ', t, end='\t')
    print(t - start_time, "multiprocesor seconds")


def mp_handler():
    p = multiprocessing.Pool(1)
    p.map(worker, data)

if __name__ == '__main__':
    print('start_time', start_time)
    lol()
    mp_handler()

Result:

$python ./testing.py 
start_time 0.020721
t:  0.020779    5.8e-05 lol seconds
t:  0.020804    8.3e-05 lol seconds
t:  0.001036    -0.019685 multiprocesor seconds
t:  0.001166    -0.019555 multiprocesor seconds

Note how t is monotonic in the lol case, while it goes back to about 0.001 in the multiprocessing case.
