SQLAlchemy与辅助表连接行为之间的关系在延迟加载和急切加载之间发生了变化 [英] SQLAlchemy relationship with secondary table joining behaviour changes between lazy and eager loading

查看：95 发布时间：2020/10/16 1:17:10 python sql-server database sqlite sqlalchemy

本文介绍了SQLAlchemy与辅助表连接行为之间的关系在延迟加载和急切加载之间发生了变化的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我已经使用SQL Alchemy几个月了，到目前为止，它给我留下了深刻的印象。

I've been playing with SQL Alchemy for a couple of months now and so far been really impressed with it.

我遇到了一个问题现在这似乎是一个错误，但是我不确定自己在做正确的事情。我们在这里使用MS SQL，通过表反射来定义表类，但是我可以使用内存SQLite数据库来复制问题，我在此处包含了代码。

There is one issue I've run into now that seems to be a bug, but I'm not sure that I'm doing the right thing. We use MS SQL here, with table reflection to define the table classes, however I can replicate the problem using an in-memory SQLite database, code for which I have included here.

如果我定义了与延迟加载的关系，那么

What I am doing is defining a many to many relationship between two tables using a linking table between them. There is one extra piece of information that the linking table contains which I want to use for filtering the links, requiring the use of a primaryjoin statement on the relationship. This works perfectly for lazy loading, however for performance reasons we need eager loading and thats where it all falls over.

If I define the relationship with lazy loading:

activefunds = relationship('Fund', secondary='fundbenchmarklink',
                           primaryjoin='and_(FundBenchmarkLink.isactive==True,'
                                       'Benchmark.id==FundBenchmarkLink.benchmarkid,'
                                       'Fund.id==FundBenchmarkLink.fundid)')

并正常查询数据库：

query = session.query(Benchmark)

我需要的行为正是我想要的，尽管由于额外的SQL性能确实很差在遍历所有基准及其各自的资金时查询。

The behaviour I need is exactly what I want, though performance is really bad, due to the extra SQL queries when iterating through all of the benchmarks and their respective funds.

如果我定义了急切加载的关系：

If I define the relationship with eager loading:

activefunds = relationship('Fund', secondary='fundbenchmarklink',
                           primaryjoin='and_(FundBenchmarkLink.isactive==True,'
                                       'Benchmark.id==FundBenchmarkLink.benchmarkid,'
                                       'Fund.id==FundBenchmarkLink.fundid)',
                           lazy='joined')

和正常查询数据库：

query = session.query(Benchmark)

它在我的脸上炸毁：

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: fund.id 
    [SQL: 'SELECT benchmark.id AS benchmark_id,
                   benchmark.name AS benchmark_name,
                   fund_1.id AS fund_1_id,
                   fund_1.name AS fund_1_name,
                   fund_2.id AS fund_2_id,
                   fund_2.name AS fund_2_name 
            FROM benchmark
            LEFT OUTER JOIN (fundbenchmarklink AS fundbenchmarklink_1
                             JOIN fund AS fund_1 ON fund_1.id = fundbenchmarklink_1.fundid) ON benchmark.id = fundbenchmarklink_1.benchmarkid
            LEFT OUTER JOIN (fundbenchmarklink AS fundbenchmarklink_2
                             JOIN fund AS fund_2 ON fund_2.id = fundbenchmarklink_2.fundid) ON fundbenchmarklink_2.isactive = 1
            AND benchmark.id = fundbenchmarklink_2.benchmarkid
            AND fund.id = fundbenchmarklink_2.fundid']

上面的SQL清楚地表明，在尝试从链接表访问列之前，未联接该链接表。

The SQL above clearly shows the linked table is not being joined before attempting to access columns from it.

如果我查询数据库，特别是加入链接表：

If I query the DB, specifically joining the linked table:

query = session.query(Benchmark).join(FundBenchmarkLink, Fund, isouter=True)

它有效，但是这意味着我现在必须确保每当查询基准表时，我总是必须定义联接以添加两个额外的表。

It works, however it means I now have to make sure that whenever I query the Benchmark table, I always have to define the join to add both of the extra tables.

我是否缺少某些东西，这是一个潜在的错误，还是仅仅是库的方式

Is there something I'm missing, is this a potential bug, or is it simply the way the library works?

完整的示例代码可以复制问题：

Full working sample code to replicate issue:

import logging

logging.basicConfig(level=logging.INFO)
logging.getLogger('sqlalchemy.engine.base').setLevel(logging.INFO)

from sqlalchemy import Column, DateTime, String, Integer, Boolean, ForeignKey, create_engine
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class FundBenchmarkLink(Base):
    __tablename__ = 'fundbenchmarklink'

    fundid = Column(Integer, ForeignKey('fund.id'), primary_key=True, autoincrement=False)
    benchmarkid = Column(Integer, ForeignKey('benchmark.id'), primary_key=True, autoincrement=False)
    isactive = Column(Boolean, nullable=False, default=True)

    fund = relationship('Fund')
    benchmark = relationship('Benchmark')

    def __repr__(self):
        return "<FundBenchmarkLink(fundid='{}', benchmarkid='{}', isactive='{}')>".format(self.fundid, self.benchmarkid, self.isactive)


class Benchmark(Base):
    __tablename__ = 'benchmark'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)

    funds = relationship('Fund', secondary='fundbenchmarklink', lazy='joined')

    # activefunds has additional filtering on the secondary table, requiring a primaryjoin statement.
    activefunds = relationship('Fund', secondary='fundbenchmarklink',
                               primaryjoin='and_(FundBenchmarkLink.isactive==True,'
                                           'Benchmark.id==FundBenchmarkLink.benchmarkid,'
                                           'Fund.id==FundBenchmarkLink.fundid)',
                               lazy='joined')

    def __repr__(self):
        return "<Benchmark(id='{}', name='{}')>".format(self.id, self.name)


class Fund(Base):
    __tablename__ = 'fund'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)

    def __repr__(self):
        return "<Fund(id='{}', name='{}')>".format(self.id, self.name)


if '__main__' == __name__:
    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    maker = sessionmaker(bind=engine)

    session = maker()

    # Create some data
    for bmkname in ['foo', 'bar', 'baz']:
        bmk = Benchmark(name=bmkname)
        session.add(bmk)

    for fname in ['fund1', 'fund2', 'fund3']:
        fnd = Fund(name=fname)
        session.add(fnd)

    session.add(FundBenchmarkLink(fundid=1, benchmarkid=1))
    session.add(FundBenchmarkLink(fundid=2, benchmarkid=1))
    session.add(FundBenchmarkLink(fundid=1, benchmarkid=2))
    session.add(FundBenchmarkLink(fundid=2, benchmarkid=2, isactive=False))

    session.commit()

    # This code snippet works when activefunds doesn't exist, or doesn't use eager loading
    # query = session.query(Benchmark)
    # print(query)

    # for bmk in query:
    #     print(bmk)
    #     for fund in bmk.funds:
    #         print('\t{}'.format(fund))

    # This code snippet works for activefunds with eager loading
    query = session.query(Benchmark).join(FundBenchmarkLink, Fund, isouter=True)
    print(query)

    for bmk in query:
        print(bmk)
        for fund in bmk.activefunds:
            print('\t{}'.format(fund))

SQLAlchemy与辅助表连接行为之间的关系在延迟加载和急切加载之间发生了变化 [英] SQLAlchemy relationship with secondary table joining behaviour changes between lazy and eager loading

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录关闭

SQLAlchemy与辅助表连接行为之间的关系在延迟加载和急切加载之间发生了变化 [英] SQLAlchemy relationship with secondary table joining behaviour changes between lazy and eager loading

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录 关闭

登录关闭