Matching two string lists that partially match into another list


Problem description

I am trying to match a list of strings (50 strings) against a second list of strings (5 strings), each of which is part of some string in the first list. I will post the complete code below for context, but I also want to give a short example:

List1 = ['abcd12', 'efgh34', 'ijkl56', 'mnop78']

List2 = ['abc', 'ijk']

I want to return a list of the strings from List1 that have a match in List2. I tried set.intersection, but it seems it can't do partial matches (or at least I can't make it do so with my limited abilities). I also tried any(), but I couldn't make it work with my lists. My book says I should use a nested loop, but I don't know which function to use or how to apply it to lists.

Here is the complete code for reference:

#!/usr/bin/env python3.4
# -*- coding: utf-8 -*-

import random

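def generateSequences (n):
    # Build n random DNA sequences, each 50 bases long.
    L = []
    dna = ["A","G","C","T"]
    for i in range(int(n)):

        random_sequence=''

        # Use a throwaway name so the outer loop variable i is not shadowed.
        for _ in range(50):
            random_sequence+=random.choice(dna)

        L.append(random_sequence)

    print(L)
    return L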

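def generatePrefixes (p, L):
    # S holds the first 20 characters of every sequence in L.
    S = [x[:20] for x in L]
    # D holds p prefixes drawn at random from S (duplicates are possible).
    D = []
    for i in range(p):
        randomPrefix = random.choice(S)
        D.append(randomPrefix)

    return S, D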

if __name__ == "__main__":
    L = generateSequences(15)
    print (L)
    S, D = generatePrefixes(5, L)
    print (S)
    print (D)

Edit: since this was flagged as a possible duplicate, I want to point out that this post is about Python while the other one is about R. I don't know R or whether there are any similarities, but at first glance it doesn't look like it to me. Sorry for the inconvenience.

Recommended answer

Using a nested for loop:

def intersect(List1, List2):
    # empty list for values that match
    ret = []
    for i in List2:
        for j in List1:
            if i in j:
                ret.append(j)
    return ret

List1 = ['abcd12', 'efgh34', 'ijkl56', 'mnop78']
List2 = ['abc', 'ijk']
print(intersect(List1, List2))
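
The nested loop above returns matches in List2 order and appends the same string more than once if it contains several of the fragments. As a possible alternative, here is a minimal sketch built on any(), which the question mentions trying. The function names match_any and match_prefix are made up for this illustration and do not come from the original answer. The first keeps List1 order and reports each string once. The second assumes the fragments are always prefixes, as in the DNA code from the question.

```python
def match_any(strings, fragments):
    """Return the items of `strings` that contain at least one fragment."""
    # any() stops at the first fragment found, so each string is kept once.
    return [s for s in strings if any(f in s for f in fragments)]


def match_prefix(strings, prefixes):
    """Return the items of `strings` that start with at least one prefix."""
    # str.startswith accepts a tuple of candidate prefixes.
    prefixes = tuple(prefixes)
    return [s for s in strings if s.startswith(prefixes)]


List1 = ['abcd12', 'efgh34', 'ijkl56', 'mnop78']
List2 = ['abc', 'ijk']
print(match_any(List1, List2))     # ['abcd12', 'ijkl56']
print(match_prefix(List1, List2))  # ['abcd12', 'ijkl56']
```

With the code from the question, match_prefix(L, D) would return the full sequences whose first 20 characters were drawn into D.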
