从PDB文件中仅提取我们需要的链 [英] Extracting only the chains that we need from a PDB file

查看:131
本文介绍了从PDB文件中仅提取我们需要的链的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要从PDB文件中提取特定的链(不止一个链).如何从PDB文件提取链?.这是相同的问题,并带有已标记"答案,回答了我的问题.但是它在python 3中不起作用.它一个接一个地给出错误.有人知道我该如何在python 3中工作吗?

I need to extract specific chains from PDB files( Sometiems more than one chain). How to extract chains from a PDB file?. It's the same question and "marked" answer, answers my problem. But it does not work in python 3. It gives errors one after the other. Does anybody knows how can i work this in python 3?

或任何其他针对相同类型问题的代码

Or any other code for the same kind of problem

谢谢.

import os
from Bio import PDB


class ChainSplitter:
    def __init__(self, out_dir=None):
        """ Create parsing and writing objects, specify output directory. """
        self.parser = PDB.PDBParser()
        self.writer = PDB.PDBIO()
        if out_dir is None:
            out_dir = os.path.join(os.getcwd(), "chain_PDBs")
        self.out_dir = out_dir

    def make_pdb(self, pdb_path, chain_letters, overwrite=False, struct=None):
        """ Create a new PDB file containing only the specified chains.

        Returns the path to the created file.

        :param pdb_path: full path to the crystal structure
        :param chain_letters: iterable of chain characters (case insensitive)
        :param overwrite: write over the output file if it exists
        """
        chain_letters = [chain.upper() for chain in chain_letters]

        # Input/output files
        (pdb_dir, pdb_fn) = os.path.split(pdb_path)
        pdb_id = pdb_fn[3:7]
        out_name = "pdb%s_%s.ent" % (pdb_id, "".join(chain_letters))
        out_path = os.path.join(self.out_dir, out_name)
        print ("OUT PATH:",out_path)
        plural = "s" if (len(chain_letters) > 1) else ""  # for printing

        # Skip PDB generation if the file already exists
        if (not overwrite) and (os.path.isfile(out_path)):
            print("Chain%s %s of '%s' already extracted to '%s'." %
                    (plural, ", ".join(chain_letters), pdb_id, out_name))
            return out_path

        print("Extracting chain%s %s from %s..." % (plural,
                ", ".join(chain_letters),  pdb_fn))

        # Get structure, write new file with only given chains
        if struct is None:
            struct = self.parser.get_structure(pdb_id, pdb_path)
        self.writer.set_structure(struct)
        self.writer.save(out_path, select=SelectChains(chain_letters))

        return out_path


class SelectChains(PDB.Select):
    """ Only accept the specified chains when saving. """
    def __init__(self, chain_letters):
        self.chain_letters = chain_letters

    def accept_chain(self, chain):
        return (chain.get_id() in self.chain_letters)


if __name__ == "__main__":
    """ Parses PDB id's desired chains, and creates new PDB structures. """
    import sys
    if not len(sys.argv) == 2:
        print ("Usage: $ python %s 'pdb.txt'" % __file__)
        sys.exit()

    pdb_textfn = sys.argv[1]

    pdbList = PDB.PDBList()
    splitter = ChainSplitter("/home/patrick/Desktop/chain_splitting")

    with open(pdb_textfn) as pdb_textfile:
        for line in pdb_textfile:
            pdb_id = line[:4].lower()
            chain = line[4]
            pdb_fn = pdbList.retrieve_pdb_file(pdb_id)
            splitter.make_pdb(pdb_fn, chain)

推荐答案

retrieve_pdb_file 具有可选参数 file_format .如果未提供任何信息,则PDB服务器将返回cif文件.Biopython的解析器需要一个PDB文件.

retrieve_pdb_file has the optional parameter file_format. When no information is provided, the PDB server returns cif files. Biopython's parser expects a PDB file.

您可以将行更改为

pdbList.retrieve_pdb_file(pdb_id, file_format='pdb')

您将获得一个PDB文件,其余代码将继续运行..

and you should get a PDB file and the rest of the code runs through..

这篇关于从PDB文件中仅提取我们需要的链的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆