Count the number of a certain triplet in a file (DNA codon analysis)

Views: 75 | Published: 2020/9/21 3:24:34 | Tags: python, shell, bioinformatics

Question

This question comes from DNA codon analysis. Put simply, say I have a file like this:
atgaaaccaaag...
I want to count how many times the triplet 'aaa' occurs in this file. Importantly, the triplets are read from the very beginning of the sequence (that is, atg, aaa, cca, aag, ...), so the result for this example should be 1, not 2.
Is there a Python or shell-script way to do this? Thanks!

Recommended answer

First, read in the file:

with open("some.txt") as f:
    file_data = f.read()

Then split it into chunks of three characters:

codons = [file_data[i:i+3] for i in range(0,len(file_data),3)]
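The reason for slicing instead of searching the raw string can be seen in a small sketch. It uses the example sequence from the question; the names `sequence`, `naive`, `in_frame`, and `framed` are illustrative and not part of the original answer.

```python
# Example sequence from the question.
sequence = "atgaaaccaaag"

# str.count scans the whole string and ignores the reading frame,
# so it also finds the 'aaa' that straddles 'cca' and 'aag'.
naive = sequence.count("aaa")

# Slicing in steps of three keeps only in-frame triplets.
in_frame = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
framed = in_frame.count("aaa")

print(naive, framed)  # prints: 2 1
```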

Then count them:

print(codons.count('aaa'))
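If counts for more than one codon are needed, every in-frame triplet can be tallied in a single pass. This is only a sketch using `collections.Counter` from the standard library; the inline `file_data` value is a stand-in for the file contents, and `codon_counts` is an illustrative name.

```python
from collections import Counter

# Stand-in for the contents of the file (assumption for illustration).
file_data = "atgaaaccaaag"

# Split into in-frame triplets, then tally every distinct codon.
codons = [file_data[i:i + 3] for i in range(0, len(file_data), 3)]
codon_counts = Counter(codons)

print(codon_counts["aaa"])  # count for a single codon
print(codon_counts["ttt"])  # a codon that never occurs counts as 0
```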

Like this:

>>> my_codons = 'atgaaaccaaag'
>>> codons = [my_codons[i:i+3] for i in range(0,len(my_codons),3)]
>>> codons
['atg', 'aaa', 'cca', 'aag']
>>> codons.count('aaa')
1
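One caveat: `f.read()` returns the file exactly as stored, so line breaks inside a multi-line sequence file would shift the reading frame. The following is a hedged sketch of a helper that strips whitespace and normalizes case before counting. The name `count_codon` and the choice to ignore an incomplete triplet at the end are assumptions, not part of the original answer.

```python
def count_codon(sequence, codon="aaa"):
    """Count in-frame occurrences of `codon`, reading from the first base."""
    # Remove all whitespace (newlines, spaces) and normalize to lowercase.
    cleaned = "".join(sequence.split()).lower()
    target = codon.lower()
    # Step through three bases at a time; stopping at len(cleaned) - 2
    # skips any incomplete triplet at the end of the sequence.
    return sum(
        1
        for i in range(0, len(cleaned) - 2, 3)
        if cleaned[i:i + 3] == target
    )


# The embedded newline would otherwise break the frame after 'atgaaacc'.
print(count_codon("atgaaacc\naaag\n"))  # prints: 1
```

The string passed in can be the `file_data` value read from the file in the first step.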
