Why does passing a list as a parameter perform better than passing a generator?


Question

I was writing an answer to this question, and when I timed my solution the result contradicted what I thought was correct.

The person who asked the question wanted a way to count how many different lists are contained within another list. (For more information, you can check the question.)

My answer was basically this function:

def how_many_different_lists(lists):
    s = set(str(list_) for list_ in lists)
    return len(s)

The surprise came when I measured its running time and compared it against essentially the same function, which passes a list instead of a generator as the argument to set():

def the_other_function(lists):
    s = set([str(list_) for list_ in lists])
    return len(s)

This is the decorator I use for timing functions:

import time

def timer(func):
    def func_decorated(*args):
        # time.clock() was removed in Python 3.8; perf_counter() replaces it
        start_time = time.perf_counter()
        result = func(*args)
        print(time.perf_counter() - start_time, "seconds")
        return result
    return func_decorated

These are the results for the given input:

>>> list1 = [[1,2,3],[1,2,3],[1,2,2],[1,2,2]]
>>> how_many_different_lists(list1)
6.916326725558974e-05 seconds
2
>>> the_other_function(list1)
3.882067261429256e-05 seconds
2

Even for a larger list:

# (52 elements)
>>> list2= [[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2],[1,2,3],[1,2,3],[1,2,2],[1,2,2]]
>>> how_many_different_lists(list2)
0.00023560132331112982 seconds
2
>>> the_other_function(list2)
0.00021329059177332965 seconds
2

Now, my question is: why is the second example faster than the first one? Aren't generators supposed to be faster because they produce the elements "on demand"? I used to think that building a list and then iterating through it was slower.

PS: I have run the test many times and get basically the same results.

Recommended answer

I have been benchmarking your functions:

from simple_benchmark import BenchmarkBuilder
from random import choice

b = BenchmarkBuilder()


@b.add_function()
def how_many_different_lists(lists):
    s = set(str(list_) for list_ in lists)
    return len(s)


@b.add_function()
def the_other_function(lists):
    s = set([str(list_) for list_ in lists])
    return len(s)


@b.add_arguments('Number of lists in the list')
def argument_provider():
    for exp in range(2, 18):
        size = 2**exp

        yield size,  [list(range(choice(range(100)))) for _ in range(size)]


r = b.run()
r.plot()

Generators are lazy: a generator expression creates its items on the fly, one at a time, whereas a list comprehension builds the entire list in memory before anything consumes it. You can read more here: Generator Expressions vs. List Comprehension
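To make the difference concrete, here is a minimal sketch (not part of the original benchmark). The helper `tracked` and the variable names are hypothetical, introduced only to record when each item is actually produced; the exact sizes reported by `sys.getsizeof` vary between Python versions.

```python
import sys


def tracked(values, log):
    """Hypothetical helper: yield each value and record when it is produced."""
    for value in values:
        log.append(value)
        yield value


# Generator expression: nothing is produced until something asks for an item.
gen_log = []
gen = (x * 2 for x in tracked(range(5), gen_log))
produced_before_next = list(gen_log)   # still empty at this point
first = next(gen)                      # pulls exactly one item through
produced_after_next = list(gen_log)    # only the first source value so far

# List comprehension: every item is produced immediately.
list_log = []
doubled = [x * 2 for x in tracked(range(5), list_log)]

# The generator object stays small; the list holds a slot for every element.
gen_size = sys.getsizeof(x for x in range(100_000))
list_size = sys.getsizeof([x for x in range(100_000)])

print("generator produced so far:", produced_after_next)
print("list comprehension produced:", list_log)
print("generator object bytes:", gen_size, "| list bytes:", list_size)
```

Laziness saves memory, but it does not by itself make iteration faster: set() still has to consume every item in both cases.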

As you can see from the benchmark, there is not much of a difference between them.
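If simple_benchmark is not installed, a rough comparison can be made with the standard-library timeit module. This is only a sketch under stated assumptions: the names `with_generator`, `with_list` and `best_time` are hypothetical, the repeat counts are arbitrary, and the absolute numbers depend on the machine and Python version. Repeating the call many times and keeping the best run is less noisy than timing a single microsecond-scale call.

```python
import timeit


def with_generator(lists):
    # Same logic as how_many_different_lists: set() consumes a generator.
    return len(set(str(list_) for list_ in lists))


def with_list(lists):
    # Same logic as the_other_function: set() consumes a prebuilt list.
    return len(set([str(list_) for list_ in lists]))


# 52 sublists, following the pattern of the larger input in the question.
data = [[1, 2, 3], [1, 2, 3], [1, 2, 2], [1, 2, 2]] * 13


def best_time(func, arg, number=2000, repeat=5):
    """Hypothetical helper: best average time per call, in seconds."""
    runs = timeit.repeat(lambda: func(arg), number=number, repeat=repeat)
    return min(runs) / number


gen_time = best_time(with_generator, data)
list_time = best_time(with_list, data)

print(f"generator: {gen_time:.3e} s per call")
print(f"list:      {list_time:.3e} s per call")
```

On CPython the list version often comes out slightly ahead, which is commonly attributed to the per-item cost of resuming the generator; the gap is small and should be measured rather than assumed.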
