Counting genetic mutations in a dictionary using Python
Question
I have data in this format:
>abc12
ATCGACAG
>def34
ACCGACG
etc.
I have stored each gene in a dictionary, with the lines beginning with > as keys and the sequences as values. So the dictionary is something like {'abc12': 'ATCGACAG', etc.}
Now I want to compare the genes by counting the number of A's, T's, C's, and G's at each site.
The only thing I can come up with is to break the dictionary into a list for each nucleotide site and use zip() with a counter. Is this the best way, and if so, how do I break the dictionary into a list for each site?
Recommended answer
Use collections.Counter:
>>> from collections import Counter
>>> Counter('ATCGACAG')
Counter({'A': 3, 'C': 2, 'G': 2, 'T': 1})
>>> Counter('ACCGACG')
Counter({'C': 3, 'A': 2, 'G': 2})
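The calls above count bases within a single sequence. For the per-site counts the question asks about, one possible sketch combines zip() with Counter. The dictionary name genes and its contents are assumptions made up for illustration, mirroring the sample data in the question.

```python
from collections import Counter

# Hypothetical dictionary in the shape described in the question:
# header line (without '>') as key, sequence as value.
genes = {'abc12': 'ATCGACAG', 'def34': 'ACCGACG'}

# zip(*sequences) yields one tuple per site (column) across all genes.
# It stops at the shortest sequence, so the 8th base of abc12 is
# ignored here because def34 has only 7 bases.
site_counts = [Counter(column) for column in zip(*genes.values())]

# Print the base counts for each site, numbering sites from 1.
for position, counts in enumerate(site_counts, start=1):
    print(position, dict(counts))
```

If the sequences can differ in length and the trailing sites matter, itertools.zip_longest with a fillvalue such as '-' is one way to pad the shorter sequences instead of truncating.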