Python: Multiple Consensus Sequences
Problem description
Starting from a list of DNA sequences, I need to return all possible consensus sequences (a consensus sequence takes, at each position, the nucleotide with the highest frequency). If several nucleotides tie for the highest frequency at some position, I need every combination of the tied nucleotides. I also need to return the profile matrix (a matrix with the count of each nucleotide at each position).
This is my code so far (but it returns only one consensus sequence):
seqList = ['TTCAAGCT','TGGCAACT','TTGGATCT','TAGCAACC','TTGGAACT','ATGCCATT','ATGGCACT']
n = len(seqList[0])

# profile[nt][i] counts how often nucleotide nt occurs at position i
profile = {'T': [0]*n, 'G': [0]*n, 'C': [0]*n, 'A': [0]*n}
for seq in seqList:
    for i, char in enumerate(seq):
        profile[char][i] += 1

# at each position, keep only the first nucleotide that reaches the highest count
consensus = ""
for i in range(n):
    max_count = 0
    max_nt = 'x'
    for nt in "ACGT":
        if profile[nt][i] > max_count:
            max_count = profile[nt][i]
            max_nt = nt
    consensus += max_nt
print(consensus)

for key, value in profile.items():
    print(key, ':', " ".join([str(x) for x in value]))
TTGCAACT
C : 0 0 1 3 2 0 6 1
A : 2 1 0 1 5 5 0 0
G : 0 1 6 3 0 1 0 0
T : 5 5 0 0 0 1 1 6
(As you can see, at position four C and G share the highest count, so I should get two consensus sequences.)
Is it possible to modify this code to obtain all possible sequences, or could you explain the logic (pseudocode) for getting the right result?
Thank you very much!
Recommended answer
I'm sure there are better ways, but this is a simple one:
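# bestseqs holds every consensus prefix built so far, each as a list of nucleotides
bestseqs = [[]]
for i in range(n):
    d = {N: profile[N][i] for N in ['T','G','C','A']}
    m = max(d.values())
    # every nucleotide tied for the highest count at position i
    l = [N for N in ['T','G','C','A'] if d[N] == m]
    # extend each prefix with each tied nucleotide
    bestseqs = [s + [N] for N in l for s in bestseqs]

for s in bestseqs:
    print(''.join(s))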
# output:
TTGGAACT
TTGCAACT
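The same tie-expanding idea can probably also be written with the standard library: collect, for each position, the list of nucleotides that reach the highest count, then take the Cartesian product of those lists. The sketch below is only one way to do it. The function name `all_consensus` and its `alphabet` parameter are made up for illustration, and it assumes all sequences have the same length and contain only characters from `alphabet`.

```python
from collections import Counter
from itertools import product


def all_consensus(seqs, alphabet="ACGT"):
    """Return (profile, consensus_list) for equal-length sequences.

    Hypothetical helper, not part of the original answer.
    """
    # zip(*seqs) yields one column (tuple of characters) per position
    columns = [Counter(col) for col in zip(*seqs)]

    # profile[nt][i] is the count of nucleotide nt at position i
    # (a Counter returns 0 for nucleotides absent from a column)
    profile = {nt: [col[nt] for col in columns] for nt in alphabet}

    # for each position, keep every nucleotide that reaches the column maximum
    choices = []
    for col in columns:
        top = max(col[nt] for nt in alphabet)
        choices.append([nt for nt in alphabet if col[nt] == top])

    # the Cartesian product of the per-position choices gives every consensus
    consensus = ["".join(combo) for combo in product(*choices)]
    return profile, consensus


seqList = ['TTCAAGCT', 'TGGCAACT', 'TTGGATCT', 'TAGCAACC',
           'TTGGAACT', 'ATGCCATT', 'ATGGCACT']
profile, consensus = all_consensus(seqList)
print(consensus)  # ['TTGCAACT', 'TTGGAACT']
```

Note that the number of consensus sequences is the product of the tie counts over all positions, so it can grow quickly if many positions are tied.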