优雅的在列表中查找子列表 [英] elegant find sub-list in list

查看:15
本文介绍了优雅的在列表中查找子列表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

给定一个包含噪声包围的已知模式的列表,是否有一种优雅的方法来获取与该模式相同的所有项目.请参阅下面的我的粗代码.

Given a list containing a known pattern surrounded by noise, is there an elegant way to get all items that equal the pattern. See below for my crude code.

list_with_noise = [7,2,1,2,3,4,2,1,2,3,4,9,9,1,2,3,4,7,4,3,1,2,3,5]
known_pattern = [1,2,3,4]
res = []


for i in list_with_noise:
    for j in known_pattern:
        if i == j:
            res.append(i)
            continue

print res

我们会得到 2, 1, 2, 3, 4, 2, 1, 2, 3, 4, 1, 2, 3, 4, 4, 3

奖励:如果不存在完整模式,请避免附加 i(即,允许 1,2,3,4 但不允许 1,2,3)

bonus: avoid appending i if the full pattern is not present (ie., allow 1,2,3,4 but not 1,2,3)

示例:

find_sublists_in_list([7,2,1,2,3,4,2,1,2,3,4,9,9,1,2,3,4,7,4,3,1,2,3,5],[1,2,3,4])

[1,2,3,4],[1,2,3,4],[1,2,3,4]


find_sublists_in_list([7,2,1,2,3,2,1,2,3,6,9,9,1,2,3,4,7,4,3,1,2,6],[1,2,3,4])

[1,2,3],[1,2,3],[1,2,3]

列表包含命名元组.

推荐答案

我知道这个问题已经有 5 个月了并且已经接受"了,但是在谷歌上搜索一个非常相似的问题让我想到了这个问题,所有的答案似乎都有几个相当重要的问题,加上我很无聊,想尝试一下 SO 的答案,所以我只是想说说我发现的东西.

I know this question is 5 months old and already "accepted", but googling a very similar problem brought me to this question and all the answers seem to have a couple of rather significant problems, plus I'm bored and want to try my hand at a SO answer, so I'm just going to rattle off what I've found.

据我所知,问题的第一部分非常简单:只需返回原始列表,其中过滤掉所有不在模式"中的元素.按照这个想法,我想到的第一个代码使用了 filter() 函数:

The first part of the question, as I understand it, is pretty trivial: just return the original list with all the elements not in the "pattern" filtered out. Following that thinking, the first code I thought of used the filter() function:

def subfinder(mylist, pattern):
    return list(filter(lambda x: x in pattern, mylist))

我想说这个解决方案肯定比原始解决方案更简洁,但它并没有更快,或者至少不是明显,如果没有很好的理由使用它们,我会尽量避免使用 lambda 表达式.事实上,我能想出的最佳解决方案涉及一个简单的列表理解:

I would say that this solution is definitely more succinct than the original solution, but it's not any faster, or at least not appreciably, and I try to avoid lambda expressions if there's not a very good reason for using them. In fact, the best solution I could come up with involved a simple list comprehension:

def subfinder(mylist, pattern):
    pattern = set(pattern)
    return [x for x in mylist if x in pattern]

此解决方案比原始解决方案更优雅且速度明显更快:理解速度比原始解决方案快约 120%,同时将模式转换为一组第一次碰撞,在我的测试中速度高达 320%.

This solution is both more elegant and significantly faster than the original: the comprehension is about 120% faster than the original, while casting the pattern into a set first bumps that up to a whopping 320% faster in my tests.

现在奖励:我会直接跳进去,我的解决方案如下:

Now for the bonus: I'll just jump right into it, my solution is as follows:

def subfinder(mylist, pattern):
    matches = []
    for i in range(len(mylist)):
        if mylist[i] == pattern[0] and mylist[i:i+len(pattern)] == pattern:
            matches.append(pattern)
    return matches

这是 Steven Rumbalski 的低效单行"的一种变体,通过添加mylist[i] == pattern[0]"检查并感谢 python 的短路评估,比两者都快得多原始声明和 itertools 版本(以及据我所知的所有其他提供的解决方案)它甚至支持重叠模式.就这样吧.

This is a variation of Steven Rumbalski's "inefficient one liner", that, with the addition of the "mylist[i] == pattern[0]" check and thanks to python's short-circuit evaluation, is significantly faster than both the original statement and the itertools version (and every other offered solution as far as I can tell) and it even supports overlapping patterns. So there you go.

这篇关于优雅的在列表中查找子列表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆