SMILES from graph

Question

Is there a method or package that converts a graph (or adjacency matrix) into a SMILES string?

For instance, I know the atoms are [6 6 7 6 6 6 6 8] ([C C N C C C C O]), and the adjacency matrix (with bond orders as entries) is

[[ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
 [ 1.,  0.,  2.,  0.,  0.,  0.,  0.,  1.],
 [ 0.,  2.,  0.,  1.,  0.,  0.,  0.,  0.],
 [ 0.,  0.,  1.,  0.,  1.,  0.,  0.,  0.],
 [ 0.,  0.,  0.,  1.,  0.,  1.,  0.,  0.],
 [ 0.,  0.,  0.,  0.,  1.,  0.,  1.,  1.],
 [ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
 [ 0.,  1.,  0.,  0.,  0.,  1.,  0.,  0.]]

I need a function that outputs 'CC1=NCCC(C)O1'.

It would also work if a function could output the corresponding "mol" object. RDKit has a 'MolFromSmiles' function; I wonder if there is something like 'MolFromGraphs'.

Thanks.

Recommended answer

Here is a simple solution. To my knowledge, there is no built-in function for this in RDKit.

from rdkit import Chem


def MolFromGraphs(node_list, adjacency_matrix):

    # create empty editable mol object
    mol = Chem.RWMol()

    # add atoms to mol and keep track of index
    node_to_idx = {}
    for i in range(len(node_list)):
        a = Chem.Atom(node_list[i])
        molIdx = mol.AddAtom(a)
        node_to_idx[i] = molIdx

    # add bonds between adjacent atoms
    for ix, row in enumerate(adjacency_matrix):
        for iy, bond in enumerate(row):

            # only traverse half the matrix (it is symmetric)
            if iy <= ix:
                continue

            # add relevant bond type (there are many more of these);
            # entries other than 0, 1 or 2 are silently skipped
            if bond == 0:
                continue
            elif bond == 1:
                bond_type = Chem.rdchem.BondType.SINGLE
                mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
            elif bond == 2:
                bond_type = Chem.rdchem.BondType.DOUBLE
                mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)

    # Convert RWMol to Mol object
    mol = mol.GetMol()

    return mol


# nodes is the atomic-number list and a is the adjacency matrix from the question
Chem.MolToSmiles(MolFromGraphs(nodes, a))

Out:
'CC1=NCCC(C)O1'

This solution is a simplified version of https://github.com/dakoner/keras-molecules/blob/dbbb790e74e406faa70b13e8be8104d9e938eba2/convert_rdkit_to_networkx.py
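
For checking a conversion, it can help to go in the opposite direction and recover the atom list and adjacency matrix from a mol object. The following is a minimal sketch, not taken from the linked file: the helper name graph_from_mol is hypothetical, and it assumes RDKit's GetAdjacencyMatrix with useBO=True, which stores bond orders as matrix entries (aromatic bonds would appear as 1.5).

```python
from rdkit import Chem


def graph_from_mol(mol):
    """Return (atomic numbers, bond-order adjacency matrix) for an RDKit mol.

    Hypothetical helper for illustration; hydrogens are included only if
    they are explicit atoms in the mol.
    """
    nodes = [atom.GetAtomicNum() for atom in mol.GetAtoms()]
    # useBO=True puts the bond order (1.0, 2.0, ...) in each matrix entry
    adjacency = Chem.GetAdjacencyMatrix(mol, useBO=True)
    return nodes, adjacency


# Parse the SMILES from the question and extract its graph
mol = Chem.MolFromSmiles("CC1=NCCC(C)O1")
nodes, adjacency = graph_from_mol(mol)
print(nodes)
print(adjacency)
```

Atom indices follow the order in which atoms appear in the SMILES string, so a matrix obtained from a different (but equivalent) SMILES may be a row and column permutation of this one.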

There are many other atom properties (such as chirality or protonation state) and bond types (triple, dative...) that may need to be set. It is better to keep track of these explicitly in your graph if possible (as in the link above), but this function can also be extended to incorporate them if required.
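
One possible way to extend the function along these lines is sketched below. It is only an illustration under stated assumptions: the name mol_from_graph_extended, the BOND_TYPES lookup table, and the optional formal_charges argument are all hypothetical and not part of RDKit or the original answer. The sketch adds triple bonds, rejects unknown matrix entries instead of skipping them, and sanitizes the result so that invalid valences are reported.

```python
from rdkit import Chem

# Assumed convention: the matrix entry is the integer bond order.
BOND_TYPES = {
    1: Chem.rdchem.BondType.SINGLE,
    2: Chem.rdchem.BondType.DOUBLE,
    3: Chem.rdchem.BondType.TRIPLE,
}


def mol_from_graph_extended(node_list, adjacency_matrix, formal_charges=None):
    """Build a sanitized RDKit mol from atomic numbers and bond orders.

    formal_charges is an optional, hypothetical per-atom list of charges.
    """
    mol = Chem.RWMol()
    for i, atomic_num in enumerate(node_list):
        atom = Chem.Atom(int(atomic_num))
        if formal_charges is not None:
            atom.SetFormalCharge(int(formal_charges[i]))
        # atoms are added in order, so the mol index equals the node index
        mol.AddAtom(atom)

    n = len(node_list)
    for ix in range(n):
        # upper triangle only, since the matrix is assumed symmetric
        for iy in range(ix + 1, n):
            order = adjacency_matrix[ix][iy]
            if order == 0:
                continue
            if order not in BOND_TYPES:
                raise ValueError(f"unsupported bond order {order!r} at ({ix}, {iy})")
            mol.AddBond(ix, iy, BOND_TYPES[order])

    mol = mol.GetMol()
    # raises if the graph does not describe a chemically valid molecule
    Chem.SanitizeMol(mol)
    return mol


# Acetonitrile: C-C single bond, C#N triple bond
acetonitrile = mol_from_graph_extended(
    [6, 6, 7],
    [[0, 1, 0],
     [1, 0, 3],
     [0, 3, 0]],
)
print(Chem.MolToSmiles(acetonitrile))
```

Chirality, aromatic bonds, and dative bonds would need further entries or arguments of their own; how to encode them in the graph is a design choice this sketch leaves open.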
