在幕后如何实现sorted(key = lambda x :)? [英] How is sorted(key=lambda x:) implemented behind the scene?
问题描述
一个例子:
names = ["George Washington", "John Adams", "Thomas Jefferson", "James Madison"]
sorted(names, key=lambda name: name.split()[-1].lower())
我知道key
用于比较不同的名称,但是它可以有两种不同的实现方式:
I know key
is used to compare different names, but it can have two different implementations:
- 首先计算每个名称的所有键,然后以某种方式将键和名称绑定在一起,并对它们进行排序. p
- 每次比较时计算密钥
第一种方法的问题在于,它必须定义另一个数据结构来绑定键和数据.第二种方法的问题在于,密钥可能会被多次计算,即name.split()[-1].lower()
将被执行多次,这非常耗时.
The problem with the first approach is that it has to define another data structure to bind the key and data. The problem with the second approach is that the key might be computed for multiple times, that is, name.split()[-1].lower()
will be executed many times, which is very time-consuming.
我只是想知道Python以哪种方式实现sorted()
.
I am just wondering in which way Python implemented sorted()
.
推荐答案
该键函数仅对每个值一次执行一次,以产生一个(keyvalue, value)
对.然后将其用于排序,然后仅按排序顺序返回值.有时称为 Schwartzian变换 .
The key function is executed just once per value, to produce a (keyvalue, value)
pair; this is then used to sort and later on just the values are returned in the sorted order. This is sometimes called a Schwartzian transform.
您可以自己对此进行测试;您可以计算该函数的调用频率,例如:
You can test this yourself; you could count how often the function is called, for example:
>>> def keyfunc(value):
... keyfunc.count += 1
... return value
...
>>> keyfunc.count = 0
>>> sorted([0, 8, 1, 6, 4, 5, 3, 7, 9, 2], key=keyfunc)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> keyfunc.count
10
或者您可以收集所有传入的值;您会看到它们遵循原始输入顺序:
or you could collect all the values that are being passed in; you'll see that they follow the original input order:
>>> def keyfunc(value):
... keyfunc.arguments.append(value)
... return value
...
>>> keyfunc.arguments = []
>>> sorted([0, 8, 1, 6, 4, 5, 3, 7, 9, 2], key=keyfunc)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> keyfunc.arguments
[0, 8, 1, 6, 4, 5, 3, 7, 9, 2]
If you want to read the CPython source code, the relevant function is called listsort()
, and the keyfunc
is used in the following loop (saved_ob_item
is the input array), which is executed before sorting takes place:
for (i = 0; i < saved_ob_size ; i++) {
keys[i] = PyObject_CallFunctionObjArgs(keyfunc, saved_ob_item[i],
NULL);
if (keys[i] == NULL) {
for (i=i-1 ; i>=0 ; i--)
Py_DECREF(keys[i]);
if (saved_ob_size >= MERGESTATE_TEMP_SIZE/2)
PyMem_FREE(keys);
goto keyfunc_fail;
}
}
lo.keys = keys;
lo.values = saved_ob_item;
因此,最后,您有两个数组,一个带有keys
,另一个带有原始值.所有排序操作并行地作用于两个数组,对lo.keys
中的值进行排序,并同时移动lo.values
中的元素.
so in the end, you have two arrays, one with keys
and one with the original values. All sort operations act on the two arrays in parallel, sorting the values in lo.keys
and moving the elements in lo.values
in tandem.
这篇关于在幕后如何实现sorted(key = lambda x :)?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!