"lambda" vs. "operator.attrgetter('xxx')" as a sort key function
Question
I am looking at some code that has a lot of sort calls using comparison functions, and it seems like it should be using key functions.
If you were to change
seq.sort(lambda x,y: cmp(x.xxx, y.xxx))
which is preferable:
seq.sort(key=operator.attrgetter('xxx'))
or:
seq.sort(key=lambda a:a.xxx)
I would also be interested in comments on the merits of making changes to existing code that works.
Recommended answer
When choosing purely between attrgetter('attributename') and lambda o: o.attributename as a sort key, attrgetter() is the faster of the two.
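Before looking at timings, it may help to confirm that the three spellings from the question order a list identically. The following is a minimal sketch rather than the asker's actual code: the Item class and the compare_by_xxx helper are hypothetical, and because Python 3 removed both cmp() and the positional comparison argument to sort(), the legacy style is emulated here with functools.cmp_to_key.

```python
from functools import cmp_to_key
from operator import attrgetter


class Item:
    """Hypothetical record type; `xxx` stands in for the real attribute."""

    def __init__(self, xxx):
        self.xxx = xxx


def compare_by_xxx(x, y):
    # Python 3 has no built-in cmp(); this expression is the usual replacement.
    return (x.xxx > y.xxx) - (x.xxx < y.xxx)


seq = [Item(3), Item(1), Item(2)]

by_cmp = sorted(seq, key=cmp_to_key(compare_by_xxx))  # legacy comparison style
by_lambda = sorted(seq, key=lambda a: a.xxx)          # lambda key
by_getter = sorted(seq, key=attrgetter("xxx"))        # attrgetter key

orders = [[item.xxx for item in result] for result in (by_cmp, by_lambda, by_getter)]
print(orders)  # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
```

Since the results match, the choice between the two key functions comes down to speed and readability rather than behaviour.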
Remember that the key function is applied only once to each element in the list, before sorting, so to compare the two we can call them directly in a time trial:
>>> from timeit import Timer
>>> from random import randint
>>> from dataclasses import dataclass, field
>>> @dataclass
... class Foo:
...     bar: int = field(default_factory=lambda: randint(1, 10**6))
...
>>> testdata = [Foo() for _ in range(1000)]
>>> def test_function(objects, key):
...     [key(o) for o in objects]
...
>>> stmt = 't(testdata, key)'
>>> setup = 'from __main__ import test_function as t, testdata; '
>>> tests = {
...     'lambda': setup + 'key=lambda o: o.bar',
...     'attrgetter': setup + 'from operator import attrgetter; key=attrgetter("bar")'
... }
>>> for name, tsetup in tests.items():
...     count, total = Timer(stmt, tsetup).autorange()
...     print(f"{name:>10}: {total / count * 10 ** 6:7.3f} microseconds ({count} repetitions)")
...
    lambda: 130.495 microseconds (2000 repetitions)
attrgetter:  92.850 microseconds (5000 repetitions)
So applying attrgetter('bar') 1000 times is roughly 40 μs faster than applying a lambda. That's because calling a Python function carries more overhead than calling into a native function such as the one produced by attrgetter().
This speed advantage translates into faster sorting too:
>>> def test_function(objects, key):
...     sorted(objects, key=key)
...
>>> for name, tsetup in tests.items():
...     count, total = Timer(stmt, tsetup).autorange()
...     print(f"{name:>10}: {total / count * 10 ** 6:7.3f} microseconds ({count} repetitions)")
...
    lambda: 218.715 microseconds (1000 repetitions)
attrgetter: 169.064 microseconds (2000 repetitions)
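The timing approach above rests on the remark that the key function runs once per element, not once per comparison. That can be checked with a key function that records its calls. This is a small sketch under assumptions: the plain Foo class and the recording_key helper are hypothetical stand-ins, not part of the benchmark above.

```python
class Foo:
    """Hypothetical stand-in for the dataclass used in the benchmark."""

    def __init__(self, bar):
        self.bar = bar


seen = []  # every value the key function was asked to extract


def recording_key(obj):
    """Return obj.bar, recording each invocation in `seen`."""
    seen.append(obj.bar)
    return obj.bar


data = [Foo(n) for n in (5, 3, 8, 1, 9, 2)]
result = sorted(data, key=recording_key)

print(len(seen), len(data))       # 6 6
print([o.bar for o in result])    # [1, 2, 3, 5, 8, 9]
```

Because the key is computed once per element, the per-call saving of attrgetter() scales with the length of the list rather than with the number of comparisons the sort performs.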