Is there a function that can calculate a score for aligned sequences given the alignment parameters?

Question

I am trying to score sequences that are already aligned. Let's say

seq1 = 'PAVKDLGAEG-ASDKGT--SHVVY----------TI-QLASTFE'
seq2 = 'PAVEDLGATG-ANDKGT--LYNIYARNTEGHPRSTV-QLGSTFE'

with the given parameters

substitution matrix : blosum62
gap open penalty : -5
gap extension penalty : -1

I looked through the Biopython cookbook, but all I could find was the substitution matrix blogsum62. I feel that someone must already have implemented this kind of library.

So can anyone suggest a library, or the shortest code, that solves my problem?

Thanks in advance.

Recommended answer

Jessada,

The Blosum62 matrix (note the spelling ;) is in Bio.SubsMat.MatrixInfo and is a dictionary with tuples resolving to scores (so ('A', 'A') is worth 4 points). It doesn't have the gaps, and it's only one triangle of the matrix (so it might have ('T', 'A') but not ('A', 'T')). There are some helper functions in Biopython, including some in Bio.pairwise2, but this is what I came up with as an answer:

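# Bio.SubsMat ships with older Biopython releases; newer ones provide
# Bio.Align.substitution_matrices instead.
from Bio.SubsMat import MatrixInfo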

def score_match(pair, matrix):
    # The matrix stores only one triangle, so fall back to the reversed pair.
    if pair not in matrix:
        return matrix[tuple(reversed(pair))]
    return matrix[pair]

def score_pairwise(seq1, seq2, matrix, gap_s, gap_e):
    # gap_s: penalty for the first column of a gap (gap open)
    # gap_e: penalty for each further column of the same gap (gap extension)
    score = 0
    gap = False  # True while we are inside a run of gap columns
    for pair in zip(seq1, seq2):
        if '-' in pair:
            # Opening column pays gap_s, every following column pays gap_e.
            score += gap_e if gap else gap_s
            gap = True
        else:
            score += score_match(pair, matrix)
            gap = False
    return score

seq1 = 'PAVKDLGAEG-ASDKGT--SHVVY----------TI-QLASTFE'
seq2 = 'PAVEDLGATG-ANDKGT--LYNIYARNTEGHPRSTV-QLGSTFE'

blosum = MatrixInfo.blosum62

print(score_pairwise(seq1, seq2, blosum, -5, -1))

This returns 82 for your alignment. There are almost certainly prettier ways to do all of this, but that should be a good start.
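To see the gap convention and the one-triangle lookup in isolation, here is a minimal sketch that needs no Biopython. It assumes a small toy matrix with made-up values (not BLOSUM62), and the helper names `symmetric` and `score_alignment` are hypothetical, chosen only for this illustration. Instead of checking the reversed pair on every lookup, it mirrors the triangular dictionary once up front.

```python
def symmetric(matrix):
    """Return a copy of a triangular pair->score dict with both key orders present."""
    full = dict(matrix)
    for (a, b), value in matrix.items():
        full[(b, a)] = value
    return full


def score_alignment(seq1, seq2, matrix, gap_open, gap_extend):
    """Score two already-aligned, equal-length sequences with open/extend gap costs."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    in_gap = False
    for a, b in zip(seq1, seq2):
        if a == '-' or b == '-':
            # First column of a gap pays gap_open, later columns pay gap_extend.
            score += gap_extend if in_gap else gap_open
            in_gap = True
        else:
            score += matrix[(a, b)]
            in_gap = False
    return score


# Toy triangular matrix: hypothetical values for illustration, not BLOSUM62.
toy = {('A', 'A'): 4, ('T', 'A'): 0, ('T', 'T'): 5}
full = symmetric(toy)

# Columns: A/A = 4, T/T = 5, gap open = -5, gap extend = -1, A/A = 4
print(score_alignment('AT--A', 'ATTTA', full, -5, -1))  # prints 7
```

Under this convention a gap of n columns costs gap_open + (n - 1) * gap_extend. Some tools charge the extension penalty on the first column as well, so it is worth checking which convention your reference scores use before comparing numbers.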