排序列表时嵌套的 lambda 语句 [英] Nested lambda statements when sorting lists

查看:15
本文介绍了排序列表时嵌套的 lambda 语句的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想先按数字排序下面的列表,然后按文本排序.

lst = ['b-3', 'a-2', 'c-4', 'd-2']# 结果:# ['a-2', 'd-2', 'b-3', 'c-4']

尝试 1

res = sorted(lst, key=lambda x: (int(x.split('-')[1]), x.split('-')[0]))

我对此并不满意,因为它需要将字符串拆分两次才能提取相关组件.

尝试 2

我想出了以下解决方案.但我希望通过 Pythonic lambda 语句有一个更简洁的解决方案.

def sorter_func(x):text, num = x.split('-')返回 int(num), 文本res = sorted(lst, key=sorter_func)

我查看了

总结:它可读且快速

用于重现基准的代码.它需要安装 simple_benchmark 才能工作(免责声明:这是我的自己的库)但可能有等效的框架来完成此类任务,但我只是熟悉它:

# 我的规格:Windows 10,Python 3.6.6 (conda)导入工具z按原样导入迭代实用程序def approach_jpp_1(lst):return sorted(lst, key=lambda x: (int(x.split('-')[1]), x.split('-')[0]))def approach_jpp_2(lst):def sorter_func(x):text, num = x.split('-')返回 int(num), 文本返回排序(lst,key=sorter_func)def jpp_nested_lambda(lst):return sorted(lst, key=lambda x: (lambda y: (int(y[1]), y[0]))(x.split('-')))def toolz_compose(lst):return sorted(lst, key=toolz.compose(lambda x: (int(x[1]), x[0]), lambda x: x.split('-')))def AshwiniChaudhary_list_comprehension(lst):return sorted(lst, key=lambda x: [(int(num), text) for text, num in [x.split('-')]])def AshwiniChaudhary_next(lst):return sorted(lst, key=lambda x: next((int(num), text) for text, num in [x.split('-')]))def PaulCornelius(lst):return sorted(lst, key=lambda x: tuple(f(a) for f, a in zip((int, str), reversed(x.split('-')))))def JeanFrançoisFabre(lst):return sorted(lst, key=lambda s : [x if i else int(x) for i,x in enumerate(reversed(s.split("-")))])defiteration_utilities_chained(lst):return sorted(lst, key=it.chained(lambda x: x.split('-'), lambda x: (int(x[1]), x[0])))从 simple_benchmark 导入基准随机导入导入字符串函数 = [方法_jpp_1,方法_jpp_2,jpp_nested_lambda,工具z_compose,AshwiniChaudhary_list_comprehension,AshwiniChaudhary_next、PaulCornelius、JeanFrançoisFabre、iteration_utilities_chained]参数 = {2**i: ['-'.join([random.choice(string.ascii_lowercase),str(random.randint(0, 2**(i-1)))])对于范围内的 _(2**i)]对于范围内的 i(3, 15)}b = 基准(函数,参数,'列表大小')%matplotlib 笔记本b.plot_difference_percentage(relative_to=approach_jpp_2)

我冒昧地包含了我自己的库之一的函数组合方法 iteration_utilities.chained:

from iteration_utilities import chainedsorted(lst, key=chained(lambda x: x.split('-'), lambda x: (int(x[1]), x[0])))

它相当快(第 2 或第 3 名),但仍然比使用您自己的函数慢.

<小时>

请注意,如果您使用具有 O(n)(或更好)算法复杂度的函数,例如 min,key 开销会更显着max.那么key-function的常数因子会更显着!

I wish to sort the below list first by the number, then by the text.

lst = ['b-3', 'a-2', 'c-4', 'd-2']

# result:
# ['a-2', 'd-2', 'b-3', 'c-4']

Attempt 1

res = sorted(lst, key=lambda x: (int(x.split('-')[1]), x.split('-')[0]))

I was not happy with this since it required splitting a string twice, to extract the relevant components.

Attempt 2

I came up with the below solution. But I am hoping there is a more succinct solution via Pythonic lambda statements.

def sorter_func(x):
    text, num = x.split('-')
    return int(num), text

res = sorted(lst, key=sorter_func)

I looked at Understanding nested lambda function behaviour in python but couldn't adapt this solution directly. Is there a more succinct way to rewrite the above code?

解决方案

In almost all cases I would simply go with your second attempt. It's readable and concise (I would prefer three simple lines over one complicated line every time!) - even though the function name could be more descriptive. But if you use it as local function that's not going to matter much.

You also have to remember that Python uses a key function, not a cmp (compare) function. So to sort an iterable of length n the key function is called exactly n times, but sorting generally does O(n * log(n)) comparisons. So whenever your key-function has an algorithmic complexity of O(1) the key-function call overhead isn't going to matter (much). That's because:

O(n*log(n)) + O(n)   ==  O(n*log(n))

There's one exception and that's the best case for Pythons sort: In the best case the sort only does O(n) comparisons but that only happens if the iterable is already sorted (or almost sorted). If Python had a compare function (and in Python 2 there really was one) then the constant factors of the function would be much more significant because it would be called O(n * log(n)) times (called once for each comparison).

So don't bother about being more concise or making it much faster (except when you can reduce the big-O without introducing too big constant factors - then you should go for it!), the first concern should be readability. So you should really not do any nested lambdas or any other fancy constructs (except maybe as exercise).

Long story short, simply use your #2:

def sorter_func(x):
    text, num = x.split('-')
    return int(num), text

res = sorted(lst, key=sorter_func)

By the way, it's also the fastest of all proposed approaches (although the difference isn't much):

Summary: It's readable and fast!

Code to reproduce the benchmark. It requires simple_benchmark to be installed for this to work (Disclaimer: It's my own library) but there are probably equivalent frameworks to do this kind of task, but I'm just familiar with it:

# My specs: Windows 10, Python 3.6.6 (conda)

import toolz
import iteration_utilities as it

def approach_jpp_1(lst):
    return sorted(lst, key=lambda x: (int(x.split('-')[1]), x.split('-')[0]))

def approach_jpp_2(lst):
    def sorter_func(x):
        text, num = x.split('-')
        return int(num), text
    return sorted(lst, key=sorter_func)

def jpp_nested_lambda(lst):
    return sorted(lst, key=lambda x: (lambda y: (int(y[1]), y[0]))(x.split('-')))

def toolz_compose(lst):
    return sorted(lst, key=toolz.compose(lambda x: (int(x[1]), x[0]), lambda x: x.split('-')))

def AshwiniChaudhary_list_comprehension(lst):
    return sorted(lst, key=lambda x: [(int(num), text) for text, num in [x.split('-')]])

def AshwiniChaudhary_next(lst):
    return sorted(lst, key=lambda x: next((int(num), text) for text, num in [x.split('-')]))

def PaulCornelius(lst):
    return sorted(lst, key=lambda x: tuple(f(a) for f, a in zip((int, str), reversed(x.split('-')))))

def JeanFrançoisFabre(lst):
    return sorted(lst, key=lambda s : [x if i else int(x) for i,x in enumerate(reversed(s.split("-")))])

def iteration_utilities_chained(lst):
    return sorted(lst, key=it.chained(lambda x: x.split('-'), lambda x: (int(x[1]), x[0])))

from simple_benchmark import benchmark
import random
import string

funcs = [
    approach_jpp_1, approach_jpp_2, jpp_nested_lambda, toolz_compose, AshwiniChaudhary_list_comprehension,
    AshwiniChaudhary_next, PaulCornelius, JeanFrançoisFabre, iteration_utilities_chained
]

arguments = {2**i: ['-'.join([random.choice(string.ascii_lowercase),
                              str(random.randint(0, 2**(i-1)))]) 
                    for _ in range(2**i)] 
             for i in range(3, 15)}

b = benchmark(funcs, arguments, 'list size')

%matplotlib notebook
b.plot_difference_percentage(relative_to=approach_jpp_2)

I took the liberty to include a function composition approach of one of my own libraries iteration_utilities.chained:

from iteration_utilities import chained
sorted(lst, key=chained(lambda x: x.split('-'), lambda x: (int(x[1]), x[0])))

It's quite fast (2nd or 3rd place) but still slower than using your own function.


Note that the key overhead would be more significant if you used a function that had O(n) (or better) algorithmic complexity, for example min or max. Then the constant factors of the key-function would be more significant!

这篇关于排序列表时嵌套的 lambda 语句的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆