How to insert relational data (many to many) in SQLAlchemy by means of API queries in Python

Question

I have posted a shorter version of this question because I think this one is too long, sorry.

First of all, I am a newcomer to databases, programming languages and so on, so I apologize if this question is not well posed or specific enough. Any help or guidance would be much appreciated.

The context I am working with is the following: I am querying an existing database by means of its APIs in order to retrieve certain information to design my own database.

The purpose of this database is, for example, to let the user enter a gene and find out where in the organism it is over-expressed (UP) or under-expressed (DOWN), and in which experiments that type of expression has been observed.

For the time being, I am just querying the existing database and parsing the JSON result to obtain, for each organism part, all the genes that are over- or under-expressed (and, for each gene, the experiments in which that type of expression has been reported).

(In the brain)

GENE1

Experiment1     UP
Experiment2     UP
Experiment3     UP
Experiment4     DOWN

GENE2

Experiment5     DOWN
Experiment2     DOWN
Experiment3     DOWN
Experiment8     UP
Experiment9     DOWN

The tables I think I will need are: "genes", "organs", "experiments" and "type of expression" (and "genes2experiments2organs").

Take into account that one gene can be expressed in more than one organism_part and can have different types of expression related to more than one experiment, and that one experiment can comprise more than one gene (many-to-many relationships).

What I would like to know first is how to add relational data, and whether my attempt is heading in the right direction or whether I should change the schema or the idea behind the database.

My first attempt is this:

###########################################
# DATABASE DEFINITION
###########################################

from sqlalchemy import create_engine, Column, Integer, String, Date, ForeignKey, Table, Float
from sqlalchemy.orm import sessionmaker, relationship, backref
from sqlalchemy.ext.declarative import declarative_base
import requests

Base = declarative_base()

Genes2experiments2organs = Table('genes2experiments2organs', Base.metadata,
  Column('gene_id', String, ForeignKey('genes.id')),
  Column('experiment_id', String, ForeignKey('experiments.id')),
  Column('organ_id', String, ForeignKey('organs.id'))
)

class Genes(Base):
    __tablename__ = 'genes'
    id = Column(String(45), primary_key=True)
    # Many-to-many relationships declared only on the "genes" side
    experiments = relationship("Experiments", secondary=Genes2experiments2organs, backref="genes")
    organs = relationship("Organs", secondary=Genes2experiments2organs, backref="genes")
    def __init__(self, id=""):
        self.id = id
    def __repr__(self):
        return "<genes(id:'%s')>" % (self.id)

class Experiments(Base):
    __tablename__ = 'experiments'
    id = Column(String(45), primary_key=True)
    def __init__(self, id=""):
        self.id = id
    def __repr__(self):
        return "<experiments(id:'%s')>" % (self.id)

class Organs(Base):
    __tablename__ = 'organs'
    id = Column(String(45), primary_key=True)
    def __init__(self, id=""):
        self.id= id
    def __repr__(self):
        return "<organs(id:'%s')>" % (self.id)

class Expression_type(Base):
    __tablename__ = 'expression_type'
    id = Column(String(45), primary_key=True)
    def __init__(self, id=""):
        self.id= id
    def __repr__(self):
        return "<expression_type(id:'%s')>" % (self.id)

#####################################################
# INSERTING DATA
#####################################################

def setUp():
    global Session
    engine=create_engine('mysql://root:password@localhost/db_name?charset=utf8', pool_recycle=3600,echo=False)
    Session=sessionmaker(bind=engine)

def add_data():   ## I am just adding genes without taking into account the other related data to these genes.....
    session = Session()
    for i in range(0, 1000, 200):
        request = requests.get('http://www.ebi.ac.uk/gxa/api/v1', params={"updownInOrganism_part": "brain", "rows": 200, "start": i})
        result = request.json()  # json() is a method in current versions of requests
        for item in result['results']:
            gene_to_add = item['gene']['ensemblGeneId']
            session.add(Genes(gene_to_add))  # no check for duplicate genes yet
    session.commit()
    session.close()


setUp()
add_data()
session = Session()
genes = session.query(Genes).all()
print("List of genes introduced:")
for gene in genes:
    print(gene.id)
session.close()

So, with this code I only populate the "genes" table, without taking into account the relationships with the other data I will have to include in the database. What is the procedure for adding relational data like this? And is there a way to avoid inserting duplicate genes, for instance, when populating the table by means of API queries?

By the way, as you can see, I did not declare all the many-to-many relationships (secondary), only those in the "genes" table, because I am not sure whether I am on the right track or completely wrong. Thank you.

Recommended answer

This should do what you are looking for ...

from sqlalchemy import (Column, create_engine, Integer, ForeignKey, Unicode,
                        Enum)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship

Base = declarative_base()

class Gene(Base):
    __tablename__ = 'gene'

    id = Column(Integer, primary_key=True)
    name = Column(Unicode(64), unique=True)

    def __init__(self, name):
        self.name = name

class Experiment(Base):
    __tablename__ = 'experiment'

    id = Column(Integer, primary_key=True)

class Organ(Base):
    __tablename__ = 'organ'

    id = Column(Integer, primary_key=True)
    name = Column(Unicode(64), unique=True)

    def __init__(self, name):
        self.name = name

class Measurement(Base):
    __tablename__ = 'measurement'

    id = Column(Integer, primary_key=True)
    experiment_id = Column(Integer, ForeignKey(Experiment.id))
    gene_id = Column(Integer, ForeignKey(Gene.id))
    organ_id = Column(Integer, ForeignKey(Organ.id))

    # Add your measured values here
    expression = Column(Enum('UP', 'DOWN'))
    # ...

    experiment = relationship(Experiment, backref='measurements')
    gene = relationship(Gene, backref='measurements')
    organ = relationship(Organ, backref='measurements')

    def __repr__(self):
        return 'Experiment %d: %s, %s, %s' % (self.experiment.id,
                         self.gene.name, self.organ.name, self.expression)

if __name__ == '__main__':
    engine = create_engine('sqlite://')
    session = sessionmaker(engine)()
    Base.metadata.create_all(engine)

    #
    # Creating the data
    #

    x = Gene('Gene X')
    y = Gene('Gene Y')
    z = Gene('Gene Z')

    heart = Organ('Heart')
    lungs = Organ('Lungs')
    brain = Organ('Brain')

    session.add_all([x, y, z, heart, lungs, brain])
    session.commit()

    experiment_1 = Experiment()
    experiment_1.measurements.extend(
            [Measurement(gene_id=x.id, organ_id=heart.id, expression='UP'),
             Measurement(gene_id=x.id, organ_id=lungs.id, expression='UP'),
             Measurement(gene_id=x.id, organ_id=brain.id, expression='DOWN'),
             Measurement(gene_id=y.id, organ_id=brain.id, expression='UP'),
             Measurement(gene_id=z.id, organ_id=brain.id, expression='DOWN')])

    experiment_2 = Experiment()
    experiment_2.measurements.extend(
            [Measurement(gene_id=y.id, organ_id=lungs.id, expression='UP'),
             Measurement(gene_id=y.id, organ_id=lungs.id, expression='UP'),
             Measurement(gene_id=y.id, organ_id=brain.id, expression='UP'),
             Measurement(gene_id=x.id, organ_id=brain.id, expression='UP'),
             Measurement(gene_id=z.id, organ_id=heart.id, expression='UP')])

    session.add_all([experiment_1, experiment_2])
    session.commit()

    #
    # Querying the data
    #

    print('All measurements in the first experiment')
    experiment = session.query(Experiment).filter(Experiment.id == 1).one()
    for measurement in experiment.measurements:
        print(measurement)
    print('')

    print('All measurements of Gene X')
    gene_x = session.query(Gene).filter(Gene.name == 'Gene X').one()
    for measurement in gene_x.measurements:
        print(measurement)
    print('')

    print('All measurements of the brain')
    the_brain = session.query(Organ).filter(Organ.name == 'Brain').one()
    for measurement in the_brain.measurements:
        print(measurement)
    print('')
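
The code above does not cover the part of the question about avoiding duplicate genes while loading results parsed from an API. A minimal sketch of one common approach, a "get or create" lookup, is given here. It reuses the answer's models but relies on the default keyword constructor instead of a custom `__init__`. The helpers `get_or_create` and `load_records`, the `accession` column on `Experiment`, and the sample records are all assumptions made for illustration; they are not part of the answer, and the real API results would first have to be parsed into this tuple shape.

```python
from sqlalchemy import Column, Enum, ForeignKey, Integer, Unicode, create_engine
from sqlalchemy.orm import relationship, sessionmaker

try:  # SQLAlchemy 1.4 and later
    from sqlalchemy.orm import declarative_base
except ImportError:  # older releases
    from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Gene(Base):
    __tablename__ = 'gene'
    id = Column(Integer, primary_key=True)
    name = Column(Unicode(64), unique=True)


class Organ(Base):
    __tablename__ = 'organ'
    id = Column(Integer, primary_key=True)
    name = Column(Unicode(64), unique=True)


class Experiment(Base):
    __tablename__ = 'experiment'
    id = Column(Integer, primary_key=True)
    # Hypothetical column: the answer's Experiment only has an id.
    accession = Column(Unicode(64), unique=True)


class Measurement(Base):
    __tablename__ = 'measurement'
    id = Column(Integer, primary_key=True)
    experiment_id = Column(Integer, ForeignKey('experiment.id'))
    gene_id = Column(Integer, ForeignKey('gene.id'))
    organ_id = Column(Integer, ForeignKey('organ.id'))
    expression = Column(Enum('UP', 'DOWN'))
    experiment = relationship(Experiment, backref='measurements')
    gene = relationship(Gene, backref='measurements')
    organ = relationship(Organ, backref='measurements')


def get_or_create(session, model, **keys):
    """Return the row matching `keys`, creating it if it does not exist."""
    instance = session.query(model).filter_by(**keys).first()
    if instance is None:
        instance = model(**keys)
        session.add(instance)
        session.flush()  # send the INSERT now so later lookups find the row
    return instance


def load_records(session, records):
    """Store (gene, organ, experiment, expression) tuples as measurements."""
    for gene_name, organ_name, accession, expression in records:
        session.add(Measurement(
            gene=get_or_create(session, Gene, name=gene_name),
            organ=get_or_create(session, Organ, name=organ_name),
            experiment=get_or_create(session, Experiment, accession=accession),
            expression=expression))
    session.commit()


engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Made-up records standing in for parsed API results.
records = [
    ('GENE1', 'brain', 'Experiment1', 'UP'),
    ('GENE1', 'brain', 'Experiment4', 'DOWN'),
    ('GENE2', 'brain', 'Experiment1', 'DOWN'),
]
load_records(session, records)

# Where is GENE1 over- or under-expressed, and in which experiment?
gene1 = session.query(Gene).filter_by(name='GENE1').one()
gene1_report = [(m.organ.name, m.experiment.accession, m.expression)
                for m in gene1.measurements]
```

Each gene, organ and experiment is stored once, and every reported expression becomes its own `Measurement` row pointing at them. The lookup is only safe with a single writer; with concurrent loaders, the `unique=True` constraints would still reject a duplicate, and the resulting error would need to be handled.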
