Python - Remove any element from a list of strings that is a substring of another element

Question

So starting with a list of strings, as below

string_list = ['rest', 'resting', 'look', 'looked', 'it', 'spit']

I want to remove any element from the list that is a substring of another element, giving the result for instance...

string_list = ['resting', 'looked', 'spit']

I have some code that achieves this, but it's embarrassingly ugly and probably needlessly complex. Is there a simple way to do this in Python?

Recommended answer

First building block: substring.

You can check for a substring using in:

>>> 'rest' in 'resting'
True
>>> 'sing' in 'resting'
False

Next, we're going to choose the naive method of creating a new list. We'll go through the items one by one and add each to the new list only if it is not a substring of any other element.

def substringSieve(string_list):
    out = []
    for s in string_list:
        # Keep s only if no other (different) string in the list contains it.
        if not any(s in r for r in string_list if s != r):
            out.append(s)
    return out
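
As a quick check, here is a self-contained sketch that runs the naive version on the list from the question. The duplicate-handling case at the end is an added illustration, not something the question asks about: because of the s != r test, identical strings are never compared with each other, so exact duplicates all survive.

```python
def substringSieve(string_list):
    out = []
    for s in string_list:
        # Keep s only if no other (different) string in the list contains it.
        if not any(s in r for r in string_list if s != r):
            out.append(s)
    return out


string_list = ['rest', 'resting', 'look', 'looked', 'it', 'spit']
result = substringSieve(string_list)
print(result)  # ['resting', 'looked', 'spit']

# Illustration only: equal strings are skipped by the s != r test,
# so both copies of 'ab' are kept while 'a' is removed.
duplicates = substringSieve(['ab', 'a', 'ab'])
print(duplicates)  # ['ab', 'ab']
```

Each element is compared against every other element, so the work grows roughly with the square of the list length.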

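You can speed it up by sorting the strings by length, longest first, which reduces the number of comparisons (a longer string can never be a substring of a shorter one, and equal-length strings are substrings of each other only when they are identical). Note that this version returns the result longest first and drops exact duplicates: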

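def substringSieve(string_list):
    # Sort a copy, longest first, so the caller's list is not modified.
    candidates = sorted(string_list, key=len, reverse=True)
    out = []
    for s in candidates:
        # Only strings already kept (same length or longer) can contain s.
        if not any(s in o for o in out):
            out.append(s)
    return out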
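
If the original order matters, one possible variant is sketched below. It is not part of the answer above: the name substring_sieve_ordered is hypothetical, and the choice to drop exact duplicates is an assumption made to match the sorted version. It uses the same longest-first pass to decide which strings survive, then filters the input in its original order.

```python
def substring_sieve_ordered(string_list):
    """Hypothetical order-preserving variant of the sorted sieve."""
    # Pass 1: decide which distinct strings survive, longest first.
    kept = []
    for s in sorted(set(string_list), key=len, reverse=True):
        if not any(s in k for k in kept):
            kept.append(s)
    kept_set = set(kept)

    # Pass 2: walk the input in its original order, emitting each
    # surviving string once (assumption: exact duplicates are dropped).
    seen = set()
    out = []
    for s in string_list:
        if s in kept_set and s not in seen:
            seen.add(s)
            out.append(s)
    return out


string_list = ['spit', 'it', 'looked', 'look', 'resting', 'rest']
result = substring_sieve_ordered(string_list)
print(result)  # ['spit', 'looked', 'resting']
```

The input list is left untouched, at the cost of a second pass and two extra sets.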
