Is there any simple way to benchmark a Python script?

Question

Usually I use the shell command time. My goal is to measure how much time and memory the script uses when the data set is small, medium, large or very large.

Are there any tools for Linux, or for Python alone, that do this?

Recommended answer

Have a look at timeit, the Python profiler and pycallgraph. Also look at SnakeViz, mentioned in a comment by nikicc. It gives you yet another visualisation of profiling data, which can be helpful.

def test():
    """Stupid test function"""
    lst = []
    for i in range(100):
        lst.append(i)

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("test()", setup="from __main__ import test"))

    # For Python>=3.5 one can also write:
    print(timeit.timeit("test()", globals=locals()))

Essentially, you pass it Python code as a string, and it runs that code the specified number of times and returns the total execution time in seconds. The important bits from the docs:

timeit.timeit(stmt='pass', setup='pass', timer=<default timer>, number=1000000, globals=None)

Create a Timer instance with the given statement, setup code and timer function and run its timeit method with number executions. The optional globals argument specifies a namespace in which to execute the code.

...and:

Timer.timeit(number=1000000)

Time number executions of the main statement. This executes the setup statement once, and then returns the time it takes to execute the main statement a number of times, measured in seconds as a float. The argument is the number of times through the loop, defaulting to one million. The main statement, the setup statement and the timer function to be used are passed to the constructor.
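To tie this back to the question about small, medium and large data sets, here is a minimal sketch that times one function at several input sizes. The function build_list, the helper benchmark and the chosen sizes are hypothetical and only for illustration. timeit.repeat is the standard-library companion to timeit.timeit; it returns one total per repetition.

```python
import timeit


def build_list(n):
    """Hypothetical workload: append n integers to a list."""
    lst = []
    for i in range(n):
        lst.append(i)
    return lst


def benchmark(sizes, number=200, repeat=3):
    """Return {size: best seconds per call} for build_list."""
    results = {}
    for n in sizes:
        # Each entry in `totals` is the time taken by `number` calls.
        totals = timeit.repeat(
            "build_list(n)",
            globals={"build_list": build_list, "n": n},
            number=number,
            repeat=repeat,
        )
        # The minimum is the run least disturbed by other processes.
        results[n] = min(totals) / number
    return results


if __name__ == "__main__":
    for size, seconds in benchmark([100, 1_000, 10_000]).items():
        print(f"n={size:>6}: {seconds * 1e6:.2f} microseconds per call")
```

Lowering number from its default of one million matters here: with larger inputs a single call is already slow, so a million repetitions would take far too long.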

Note: By default, timeit temporarily turns off garbage collection during the timing. The advantage of this approach is that it makes independent timings more comparable. The disadvantage is that GC may be an important component of the performance of the function being measured. If so, GC can be re-enabled as the first statement in the setup string. For example:

timeit.Timer('for i in range(10): oct(i)', 'gc.enable()').timeit()
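To see whether garbage collection matters for a particular workload, one option is to time the same statement both ways. This is only a sketch: the allocation-heavy statement and the repetition count are arbitrary choices for illustration, and the difference may well be small or noisy on a given machine.

```python
import timeit

# Allocation-heavy statement (illustrative) so the collector has work to do.
STMT = "[[] for _ in range(1000)]"
NUMBER = 1000

# Default behaviour: timeit disables garbage collection while timing.
gc_off = timeit.timeit(STMT, number=NUMBER)

# Re-enable the collector as the first statement of the setup string.
gc_on = timeit.timeit(STMT, setup="import gc; gc.enable()", number=NUMBER)

print(f"GC disabled: {gc_off:.4f} s")
print(f"GC enabled:  {gc_on:.4f} s")
```

The setup string imports gc explicitly so that the example does not depend on which names happen to be visible in the namespace timeit uses.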

Profiling

Profiling will give you a much more detailed idea about what's going on. Here's the "instant example" from the official docs:

import cProfile
import re
cProfile.run('re.compile("foo|bar")')

Which will give you:

      197 function calls (192 primitive calls) in 0.002 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.001    0.001 <string>:1(<module>)
     1    0.000    0.000    0.001    0.001 re.py:212(compile)
     1    0.000    0.000    0.001    0.001 re.py:268(_compile)
     1    0.000    0.000    0.000    0.000 sre_compile.py:172(_compile_charset)
     1    0.000    0.000    0.000    0.000 sre_compile.py:201(_optimize_charset)
     4    0.000    0.000    0.000    0.000 sre_compile.py:25(_identityfunction)
   3/1    0.000    0.000    0.000    0.000 sre_compile.py:33(_compile)

Both of these modules (timeit and cProfile) should give you an idea about where to look for bottlenecks.
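The default report is ordered by name, which is rarely what you want when hunting for bottlenecks. As a rough sketch, the standard-library pstats module can sort the same data by cumulative time and show only the top entries. The helper name profile_top and its limit parameter are made up for this example.

```python
import cProfile
import io
import pstats
import re


def profile_top(func, limit=5):
    """Profile func() and return a report of the costliest calls."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_top(lambda: re.compile("foo|bar")))
```

If you prefer a graphical view, profiler.dump_stats(path) writes the same data to a file, which is the kind of input that visualisers such as SnakeViz are designed to read.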

Also, to get to grips with the output of profile, have a look at this post.

NOTE: pycallgraph has been officially abandoned since Feb. 2018. As of Dec. 2020 it was still working on Python 3.6, though. As long as there are no core changes in how Python exposes the profiling API, it should remain a helpful tool.

This module uses graphviz to create call graphs.

The colours make it easy to see which paths used up the most time. You can create the graphs either through the pycallgraph API or with the packaged script:

pycallgraph graphviz -- ./mypythonscript.py

The overhead is quite considerable, though. So for already long-running processes, creating the graph can take some time.
