Python递归生成器性能 [英] Python recursive generators performance

查看:74
本文介绍了Python递归生成器性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在python中,当将纯递归函数更改为递归生成器(而不是普通生成器)时,性能似乎会下降.

In python, when changing a purely recursive function into a recursive generator (not a plain generator) the performance seems to be degrading.

例如,这是找到列表的所有组合的两个函数之间的性能比较:

For example, here is a performance comparison between two functions which find all combinations of a list:

from datetime import datetime as dt

def rec_subsets(ms, i=0, s=[]):
    if i == len(ms):
        # do something with s
        return
    rec_subsets(ms, i+1, s)
    rec_subsets(ms, i+1, s + [ms[i]])

def gen_subsets(ms, i=0, s=[]):
    if i == len(ms):
        yield s
        return
    for a in gen_subsets(ms, i+1, s): yield a
    for a in gen_subsets(ms, i+1, s + [ms[i]]): yield a

t1 = dt.now()
rec_subsets(range(20))
t2 = dt.now()
print t2 - t1

t1 = dt.now()
for _ in gen_subsets(range(20)): pass
t2 = dt.now()
print t2 - t1

具有以下输出:

0:00:01.027000  # rec_subsets
0:00:02.860000  # gen_subsets

人们自然希望 gen_subsets rec_subsets 差不多,但是事实并非如此,它要慢得多.

One would naturally expect gen_subsets to be approximately as fast as rec_subsets but this is not the case, it is much slower.

这是正常现象还是我缺少什么?

Is this normal or am I missing something?

推荐答案

rec_subsets()仍然更快(对于range(20)),即使在# do something with s处添加了result.append(s)gen_subsets()rec_subsets()已消耗.

rec_subsets() is still faster (for range(20)) even if result.append(s) is added inplace of # do something with s and the results of both gen_subsets() and rec_subsets() are consumed.

以下 PEP 380(yield from语法支持):

It might be explained by the following quote from PEP 380 (yield from syntax support):

使用专门的语法为优化开辟了可能性 当有一长串的发电机时.这样的链条可能会出现,因为 例如,当递归地遍历树结构时.开销 上下传递__next__()调用并产生值的过程 在最坏的情况下可能会导致本应成为 O(n)的操作变为 情况下, O(n ** 2).

Using a specialised syntax opens up possibilities for optimisation when there is a long chain of generators. Such chains can arise, for instance, when recursively traversing a tree structure. The overhead of passing __next__() calls and yielded values down and up the chain can cause what ought to be an O(n) operation to become, in the worst case, O(n**2).

您可以使用 itertools.combinations() 生成电源集:

You could generate a powerset using itertools.combinations():

from itertools import combinations

def subsets_comb(lst):
    return (comb for r in range(len(lst)+1) for comb in combinations(lst, r))

在我的计算机上使用range(20)的速度更快:

It is faster for range(20) on my machine:

name                    time ratio comment
subsets_comb        227 msec  1.00 [range(0, 20)]
subsets_ipowerset   476 msec  2.10 [range(0, 20)]
subsets_rec         957 msec  4.22 [range(0, 20)]
subsets_gen_pep380 2.34  sec 10.29 [range(0, 20)]
subsets_gen        2.63  sec 11.59 [range(0, 20)]

要重现结果,请运行time-subsets.py .

To reproduce the results, run time-subsets.py.

这篇关于Python递归生成器性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆