给定另一个包含了来自ListA的部分字符串的ListB，对TSV字符串的ListA进行排序 [英] Sort ListA of TSV strings given another ListB that contains partial string from ListA

查看：126 发布时间：2020/5/2 8:28:20 python-3.x list sorting

本文介绍了给定另一个包含了来自ListA的部分字符串的ListB，对TSV字符串的ListA进行排序的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有2个列表:ListA包含由制表符分隔的字符串，而ListB包含与ListA中的字符串部分匹配的字符串.我想通过使ListB的部分字符串与ListA中的字符串匹配，以ListB中找到的相同顺序对ListA中的字符串进行排序.

I have 2 lists: ListA that contains strings delimited by tabs and ListB that contains strings that partially match the strings in ListA. I would like to order the strings in ListA in the same order found in ListB by having the partial strings of ListB match the strings in ListA.

我尝试的是在ListA上循环，用\t分隔每行，用_分隔第5列，并将字符串附加到临时ListC.然后，我订购了ListC，但是我仍然不知道如何在给定ListC的情况下订购实际的ListA.

What I tried is to loop on ListA, split each line by \t, split the 5th column by _ and append the string to a temporary ListC. Then, I ordered ListC but I still don't know how I can order the actual ListA given ListC.

ListA = ['rs141130360\tchr1:16495\tC\t653635\tNC_024540.1\tTranscript\tintron_variant,non_coding_transcript_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t-\t-1\t-\tSNV\tWASH7P\tEntrezGene\tHGNC:38034\ttranscribed_pseudogene\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\tOK\t-\t-\t-\t-\t8/10\t-\t-\tNR_024540.1:n.1080+112C>G\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n',
         'rs141130360\tchr1:16495\tC\t100287102\tNR_046018.2\tTranscript\tdownstream_gene_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t2086\t1\t-\tSNV\tDDX11L1\tEntrezGene\tHGNC:37102\ttranscribed_pseudogene\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n',
         'rs141130360\tchr1:16495\tC\t102466751\tNG_106918.1\tTranscript\tdownstream_gene_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t874\t-1\t-\tSNV\tMIR6859-1\tEntrezGene\tHGNC:50039\tmiRNA\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n']

ListB = ["NC", "NG", "NM", "NP", "NR", "XM", "XP", "XR", "WP"]
ListC = []


for i in ListA:
    i_split = i.split("\t")[4].split("_")[0]
    ListC.append(i_split)
ListC = sorted(ListC, key=lambda x: ListB.index(x))
print(ListC)

将打印:

['NC', 'NG', 'NR']

我的预期结果如下:

['rs141130360\tchr1:16495\tC\t653635\tNC_024540.1\tTranscript\tintron_variant,non_coding_transcript_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t-\t-1\t-\tSNV\tWASH7P\tEntrezGene\tHGNC:38034\ttranscribed_pseudogene\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\tOK\t-\t-\t-\t-\t8/10\t-\t-\tNR_024540.1:n.1080+112C>G\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n',
'rs141130360\tchr1:16495\tC\t102466751\tNG_106918.1\tTranscript\tdownstream_gene_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t874\t-1\t-\tSNV\tMIR6859-1\tEntrezGene\tHGNC:50039\tmiRNA\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n', 
'rs141130360\tchr1:16495\tC\t100287102\tNR_046018.2\tTranscript\tdownstream_gene_variant\t-\t-\t-\t-\t-\trs3210724\tG\tMODIFIER\t2086\t1\t-\tSNV\tDDX11L1\tEntrezGene\tHGNC:37102\ttranscribed_pseudogene\t-\t-\t-\t-\t-\t-\t-\t-\t-\tRefSeq\tG\tG\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\t-\n']

给定另一个包含了来自ListA的部分字符串的ListB，对TSV字符串的ListA进行排序 [英] Sort ListA of TSV strings given another ListB that contains partial string from ListA

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

给定另一个包含了来自ListA的部分字符串的ListB，对TSV字符串的ListA进行排序 [英] Sort ListA of TSV strings given another ListB that contains partial string from ListA

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭