Measure elapsed time, amount of memory and CPU used by an external program

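Problem description

I'm executing an external program from Python. I want to know which is the better choice for calling it: subprocess.Popen() or subprocess.call(). I also need to measure the elapsed time and the amount of memory and CPU used by the external program. I've heard of psutil, but I don't really know which to choose.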

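Solution

"Also, I need to measure elapsed time, amount of memory and CPU used by the external program"

(I'm going to assume you only need the information available in your platform's rusage. And, since Windows has no such thing at all, I'm also going to assume you don't care about Windows. If you need additional information that's only available in some platform-specific way (reading out of Linux's proc filesystem, calling AIX's monitor APIs, and so on), you pretty much can't do this with the stdlib, and psutil is the only answer.)

The subprocess library wraps up calling fork, then an execv-family function in the child and a waitpid-family function in the parent. (You can see this by starting with the source of call, which is itself a thin wrapper around Popen, and tracing down into the other calls from there.)

Unfortunately, the easy way to get resource usage from a child is to call wait3 or wait4, not wait or waitpid. So subprocess gets you maddeningly close to what you want, but not quite there.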
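
To make the wait4 point concrete, here is a minimal POSIX-only sketch of the fork / exec / wait sequence described above, with os.wait4 in place of waitpid. The function name run_and_measure is hypothetical, and the sketch deliberately leaves out everything else subprocess handles for you (pipes, close_fds, signal handling and so on), so treat it as an illustration rather than a replacement.

```python
import os
import sys
import time


def run_and_measure(argv):
    """Fork, exec argv in the child, and reap it with wait4 (POSIX only)."""
    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image with the target program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; exit without running parent cleanup.
            os._exit(127)
    # Parent: wait4 works like waitpid but also returns the child's rusage.
    _, status, rusage = os.wait4(pid, 0)
    elapsed = time.perf_counter() - start
    if os.WIFSIGNALED(status):
        returncode = -os.WTERMSIG(status)
    else:
        returncode = os.WEXITSTATUS(status)
    return returncode, elapsed, rusage


if __name__ == "__main__":
    rc, elapsed, ru = run_and_measure([sys.executable, "-c", "pass"])
    print("exit={} wall={:.3f}s user={}s sys={}s".format(
        rc, elapsed, ru.ru_utime, ru.ru_stime))
```

Note that rusage carries CPU time and peak memory but no wall-clock time, which is why the sketch times the call separately with time.perf_counter.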

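Still, you have a few options:

  • If you only have one child process, getrusage(RUSAGE_CHILDREN) is all you need.
  • You can launch the process, then use psutil (or platform-specific code) to get resource information from proc.pid before reaping the child.
  • You can use psutil to do everything, leaving subprocess behind.
  • You can subclass subprocess.Popen to override its wait method.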
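
For the first option, a minimal sketch might look like the following. The function name measure_children and the dictionary keys are my own choices, not part of any API. Because RUSAGE_CHILDREN accumulates over every child that has been waited for, the sketch takes a snapshot before and after and reports the difference, which is only meaningful if no other child finishes in between.

```python
import resource
import subprocess
import sys
import time


def measure_children(cmd):
    """Run cmd with subprocess.call and report its CPU time via getrusage."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    retcode = subprocess.call(cmd)
    elapsed = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "returncode": retcode,
        "elapsed": elapsed,
        # CPU times are cumulative, so subtract the earlier snapshot.
        "user_cpu": after.ru_utime - before.ru_utime,
        "system_cpu": after.ru_stime - before.ru_stime,
        # ru_maxrss is a high-water mark over all reaped children, not a
        # per-call delta; it is in kilobytes on Linux and bytes on macOS.
        "max_rss": after.ru_maxrss,
    }


if __name__ == "__main__":
    print(measure_children([sys.executable, "-c", "print('hello')"]))
```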

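The last option, subclassing Popen, is a lot simpler than it sounds. If you look at the source, there are only 3 places where os.waitpid is actually called, and only one of them will affect your code; I think it's the one in _try_wait. So (untested):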

import os
import subprocess


class ResourcePopen(subprocess.Popen):
    # Stays None if the child could not be reaped through wait4.
    rusage = None

    # _try_wait is a private, POSIX-only CPython method, so this override
    # may break between Python versions. Written against Python 3.5+, where
    # os.wait4 retries on EINTR by itself (PEP 475).
    def _try_wait(self, wait_flags):
        """All callers to this function MUST hold self._waitpid_lock."""
        try:
            # wait4 behaves like waitpid but also returns the child's rusage.
            (pid, sts, res) = os.wait4(self.pid, wait_flags)
        except ChildProcessError:
            # This happens if SIGCLD is set to be ignored or waiting
            # for child processes has otherwise been disabled for our
            # process.  This child is dead, we can't get the status.
            pid = self.pid
            sts = 0
        else:
            # With WNOHANG, pid is 0 while the child is still running.
            if pid == self.pid:
                self.rusage = res
        return (pid, sts)


def resource_call(*popenargs, timeout=None, **kwargs):
    """Run command with arguments.  Wait for command to complete or
    timeout, then return the returncode attribute and resource usage.

    The arguments are the same as for the Popen constructor.  Example:

    retcode, rusage = resource_call(["ls", "-l"])
    """
    with ResourcePopen(*popenargs, **kwargs) as p:
        try:
            retcode = p.wait(timeout=timeout)
            return retcode, p.rusage
        except:  # Including KeyboardInterrupt, as subprocess.call does.
            p.kill()
            p.wait()
            raise

And now:

retcode, rusage = resource_call(['spam', 'eggs'])
print('spam used {}s of system time'.format(rusage.ru_stime))
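
The rusage value is a resource.struct_rusage, so the other things the question asks about can be read off the same object. Here is a small sketch of a helper (summarize_rusage is a hypothetical name) that pulls out CPU time and peak memory. It assumes ru_maxrss is reported in kilobytes everywhere except macOS, where it is in bytes; check your platform's getrusage man page before relying on that. The demo uses the current process's own usage just to have a struct to feed it.

```python
import resource
import sys


def summarize_rusage(ru):
    """Extract CPU seconds and peak resident memory from a struct_rusage."""
    # Assumption: kilobytes on Linux and the BSDs, bytes on macOS.
    scale = 1 if sys.platform == "darwin" else 1024
    return {
        "user_cpu_s": ru.ru_utime,
        "system_cpu_s": ru.ru_stime,
        "total_cpu_s": ru.ru_utime + ru.ru_stime,
        "peak_rss_bytes": ru.ru_maxrss * scale,
    }


if __name__ == "__main__":
    # Any struct_rusage works here, including one returned by os.wait4.
    print(summarize_rusage(resource.getrusage(resource.RUSAGE_SELF)))
```

Elapsed wall-clock time is not part of rusage; time the call yourself, for example with time.perf_counter() around it.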

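Compare the resource_call version to a hybrid with psutil (which won't even work when used this way on many platforms, since p.wait() has already reaped the child by the time cpu_times() is called):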

p = subprocess.Popen(['spam', 'eggs'])
ps = psutil.Process(p.pid)
p.wait()
print('spam used {}s of system time'.format(ps.cpu_times().system))

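Of course, the latter isn't more complex for no good reason; it's more complex because it's a whole lot more powerful and flexible. You can get all kinds of data that rusage doesn't include, you can get information every second while the process is running instead of waiting until it's done, you can use it on Windows, and so on.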
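
As a rough sketch of the "information while the process is running" point, the following polls a child through psutil until it exits. It assumes psutil is installed (it is a third-party package), and sample_while_running and its interval parameter are hypothetical names of mine. The import is guarded only so the module still loads without psutil.

```python
import subprocess
import sys
import time

try:
    import psutil  # third-party; assumed installed, e.g. via pip
except ImportError:
    psutil = None


def sample_while_running(cmd, interval=0.1):
    """Poll a child's CPU times and RSS until it exits.

    Returns (returncode, samples), where each sample is a tuple of
    (timestamp, user_cpu_seconds, system_cpu_seconds, rss_bytes).
    """
    if psutil is None:
        raise RuntimeError("psutil is required for this sketch")
    proc = psutil.Popen(cmd)  # subprocess.Popen plus psutil.Process methods
    samples = []
    while proc.poll() is None:
        try:
            cpu = proc.cpu_times()
            rss = proc.memory_info().rss
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            # The child exited (or became unreadable) between poll and read.
            break
        samples.append((time.monotonic(), cpu.user, cpu.system, rss))
        time.sleep(interval)
    return proc.wait(), samples


if __name__ == "__main__" and psutil is not None:
    code = "import time; time.sleep(0.5)"
    rc, samples = sample_while_running([sys.executable, "-c", code])
    print("exit={} with {} samples".format(rc, len(samples)))
```

Sampling like this can miss short spikes between polls, so a short-lived child may produce few or no samples.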
