在幕后如何实现sorted(key = lambda x :)? [英] How is sorted(key=lambda x:) implemented behind the scene?

查看:260
本文介绍了在幕后如何实现sorted(key = lambda x :)?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一个例子:

names = ["George Washington", "John Adams", "Thomas Jefferson", "James Madison"]
sorted(names, key=lambda name: name.split()[-1].lower())

我知道key用于比较不同的名称,但是它可以有两种不同的实现方式:

I know key is used to compare different names, but it can have two different implementations:

  1. 首先计算每个名称的所有键,然后以某种方式将键和名称绑定在一起,并对它们进行排序. p
  2. 每次比较时计算密钥

第一种方法的问题在于,它必须定义另一个数据结构来绑定键和数据.第二种方法的问题在于,密钥可能会被多次计算,即name.split()[-1].lower()将被执行多次,这非常耗时.

The problem with the first approach is that it has to define another data structure to bind the key and data. The problem with the second approach is that the key might be computed for multiple times, that is, name.split()[-1].lower() will be executed many times, which is very time-consuming.

我只是想知道Python以哪种方式实现sorted().

I am just wondering in which way Python implemented sorted().

推荐答案

该键函数仅对每个值一次执行一次,以产生一个(keyvalue, value)对.然后将其用于排序,然后仅按排序顺序返回值.有时称为 Schwartzian变换 .

The key function is executed just once per value, to produce a (keyvalue, value) pair; this is then used to sort and later on just the values are returned in the sorted order. This is sometimes called a Schwartzian transform.

您可以自己对此进行测试;您可以计算该函数的调用频率,例如:

You can test this yourself; you could count how often the function is called, for example:

>>> def keyfunc(value):
...     keyfunc.count += 1
...     return value
...
>>> keyfunc.count = 0
>>> sorted([0, 8, 1, 6, 4, 5, 3, 7, 9, 2], key=keyfunc)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> keyfunc.count
10

或者您可以收集所有传入的值;您会看到它们遵循原始输入顺序:

or you could collect all the values that are being passed in; you'll see that they follow the original input order:

>>> def keyfunc(value):
...     keyfunc.arguments.append(value)
...     return value
...
>>> keyfunc.arguments = []
>>> sorted([0, 8, 1, 6, 4, 5, 3, 7, 9, 2], key=keyfunc)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> keyfunc.arguments
[0, 8, 1, 6, 4, 5, 3, 7, 9, 2]

如果您想阅读CPython源代码,则相关函数称为

If you want to read the CPython source code, the relevant function is called listsort(), and the keyfunc is used in the following loop (saved_ob_item is the input array), which is executed before sorting takes place:

for (i = 0; i < saved_ob_size ; i++) {
    keys[i] = PyObject_CallFunctionObjArgs(keyfunc, saved_ob_item[i],
                                           NULL);
    if (keys[i] == NULL) {
        for (i=i-1 ; i>=0 ; i--)
            Py_DECREF(keys[i]);
        if (saved_ob_size >= MERGESTATE_TEMP_SIZE/2)
            PyMem_FREE(keys);
        goto keyfunc_fail;
    }
}

lo.keys = keys;
lo.values = saved_ob_item;

因此,最后,您有两个数组,一个带有keys,另一个带有原始值.所有排序操作并行地作用于两个数组,对lo.keys中的值进行排序,并同时移动lo.values中的元素.

so in the end, you have two arrays, one with keys and one with the original values. All sort operations act on the two arrays in parallel, sorting the values in lo.keys and moving the elements in lo.values in tandem.

这篇关于在幕后如何实现sorted(key = lambda x :)?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆