Processing the genetic code with Python?


Problem description


I have many notepad documents that all contain long chunks of genetic
code. They look something like this:

atggctaaactgaccaagcgcatgcgtgttatccgcgagaaagttgatgc aaccaaacag
tacgacatcaacgaagctatcgcactgctgaaagagctggcgactgctaa attcgtagaa
agcgtggacgtagctgttaacctcggcatcgacgctcgtaaatctgacca gaacgtacgt
ggtgcaactgtactgccgcacggtactggccgttccgttcgcgtagccgt atttacccaa

Basically, I want to design a program using Python that can open and
read these documents. However, I want them to be read 3 base pairs at a
time (to analyse them codon by codon) and to find the value assigned to
each codon. An example of this is below:

** If the three base pairs were UUU the value assigned to it (from the
codon value table) would be 0.296
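
A minimal sketch of the codon-by-codon lookup described above, assuming the codon value table is held in a dictionary. Only the UUU value (0.296) comes from the post; the names `CODON_VALUES` and `split_codons` are made up for illustration, and a real table would need an entry for every codon that can occur.

```python
# Hypothetical codon value table; only the UUU entry is taken from the post.
CODON_VALUES = {"UUU": 0.296}


def split_codons(seq):
    """Return consecutive, non-overlapping 3-letter chunks of seq.

    Any trailing one or two letters that do not form a full codon are dropped.
    """
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Eight letters: two full codons, with two letters left over and ignored.
codons = split_codons("UUUUUUUU")
values = [CODON_VALUES[codon] for codon in codons]
print(codons)
print(values)
```

Stepping through the string with `range(0, usable, 3)` is one simple way to make sure every letter is consumed in threes, which is the part the post says keeps going wrong.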

The program has to read all the sequence three pairs at a time, then I
want to get all the values for each codon, multiply them together and
put them to the power of 1 / the length of the sequence in codons
(which is the length of the whole sequence divided by three).
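
The calculation described here is a geometric mean of the codon values. A rough sketch follows, with one deliberate change from the wording above: instead of multiplying all the values directly, it sums their logarithms. The result is mathematically the same, but the product of thousands of values below 1 can underflow to zero in floating point. The function name and the example numbers are illustrative only, and the sketch assumes every value is greater than zero.

```python
import math


def geometric_mean(values):
    """Compute (v1 * v2 * ... * vn) ** (1 / n) using logarithms.

    Assumes every value is strictly positive; math.log raises
    ValueError for zero or negative input.
    """
    if not values:
        raise ValueError("need at least one codon value")
    log_sum = sum(math.log(v) for v in values)
    return math.exp(log_sum / len(values))


# Made-up values, for illustration only.
example_values = [0.296, 0.5, 1.0]
print(geometric_mean(example_values))
```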

However, to make things even more complicated, the notebook sequences
are in lowercase and the codon value table is in uppercase, so the
sequences need to be converted into uppercase. Also, the Ts in the DNA
sequences need to be changed to Us (again to match the codon value
table). And finally, before the DNA sequences are read and analysed I
need to remove the first 50 codons (i.e. the first 150 letters) and the
last 20 codons (the last 60 letters) from the DNA sequence. I've also
been having problems ensuring the program reads ALL the sequence 3
letters at a time.
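
The preparation steps listed above could be sketched as follows. This is only one possible reading of the requirements: it assumes the spaces and line breaks in the files are not part of the sequence, that trimming is counted in letters from each end, and that a leftover partial codon at the end should be dropped. All function names are hypothetical.

```python
def clean_sequence(raw):
    """Remove all whitespace, convert to uppercase and replace T with U."""
    return "".join(raw.split()).upper().replace("T", "U")


def trimmed_codons(raw, skip_start=50, skip_end=20):
    """Return the codons left after dropping skip_start codons from the
    front and skip_end codons from the back of the cleaned sequence."""
    seq = clean_sequence(raw)
    start = skip_start * 3
    # Never let the end index fall before the start index on short input.
    end = max(start, len(seq) - skip_end * 3)
    core = seq[start:end]
    usable = len(core) - len(core) % 3  # ignore a trailing partial codon
    return [core[i:i + 3] for i in range(0, usable, 3)]


def codons_from_file(path, skip_start=50, skip_end=20):
    """Read a whole text file and return its trimmed codons."""
    with open(path) as handle:
        return trimmed_codons(handle.read(), skip_start, skip_end)


# Small demonstration: trim one codon from each end instead of 50 and 20.
demo = "atggct aaactg\naccaag"
print(trimmed_codons(demo, skip_start=1, skip_end=1))
# prints ['GCU', 'AAA', 'CUG', 'ACC']
```

With the defaults, a sequence shorter than 210 letters yields an empty list, so a caller would probably want to check for that before computing anything from the codons.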

I've tried various ways of doing this but keep coming unstuck along the
way. Has anyone got any suggestions for how they would tackle this
problem?
Thanks for any help received!

Recommended answer

nuttydevil wrote:
I've tried various ways of doing this but keep coming unstuck along the
way. Has anyone got any suggestions for how they would tackle this
problem?
Thanks for any help received!







Show us your ways, show us where you got stuck - then we might be able to help you.

Diez


Diez B. Roggisch wrote:
nuttydevil wrote:
I've tried various ways of doing this but keep coming unstuck along the
way. Has anyone got any suggestions for how they would tackle this
problem?
Thanks for any help received!





Show us your ways, show us where you got stuck - then we might be able to help you.







Also, take a look at the biopython package, which will almost certainly
save you large amounts of time on tasks such as the one you describe.

http://www.biopython.org/

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd www.holdenweb.com
Love me, love my blog holdenweb.blogspot.com


In article <11**********************@e56g2000cwe.googlegroups .com>,
nuttydevil <sj***@sussex.ac.uk> wrote:
I have many notepad documents that all contain long chunks of genetic
code. They look something like this:

atggctaaactgaccaagcgcatgcgtgttatccgcgagaaagttgatg caaccaaacag
tacgacatcaacgaagctatcgcactgctgaaagagctggcgactgcta aattcgtagaa
agcgtggacgtagctgttaacctcggcatcgacgctcgtaaatctgacc agaacgtacgt
ggtgcaactgtactgccgcacggtactggccgttccgttcgcgtagccg tatttacccaa

Basically, I want to design a program using python that can open and
read these documents.







Start by googling for biopython.

