提取集合子集的策略 [英] Strategy to fetch a subset of a collection
问题描述
我有一种情况,我的应用程序可以在有限的时间窗口内访问会话,在此期间,它必须从数据库中将数据提取到内存中,然后仅使用内存中的数据来处理请求.
I have a scenario where my application has access to a session for limited time windows, during which it must fetch data from the database into memory, and then only use the in-memory data to serve requests.
数据模型是一个简单的一对多关联,例如:
The data model is a simple one-to-many association such as:
<class name="com.foo.Road" table="road">
<id name="oid" column="oid"/>
<map name="carCountMap" fetch="subselect">
<key column="road_oid" foreign-key="oid"/>
<index column="time_oid" type="long" />
<one-to-many class="com.foo.CarCount" />
</map>
<map name="truckCountMap" fetch="subselect">
<key column="road_oid" foreign-key="oid"/>
<index column="time_oid" type="long" />
<one-to-many class="com.foo.TruckCount" />
</map>
</class>
现在假设汽车和卡车的计数数据已经存在了几年,远远超出了内存中的容量.此外,我只对最近3个月的载车量感兴趣.
Now suppose that the car and truck count data exists for several years, which is much more than can fit in memory. Furthermore, I'm only really interested in loading the car counts in the last 3 months.
我的问题是,使用休眠方式加载数据的最佳方法是什么:
My question is, what is the best way to load this data using hibernate such that:
- road.getCarCountMap()仅返回最近3个月内的汽车计数(可能为空)的集合
- 我最终并没有得到一些需要花很多时间才能处理的疯狂的笛卡尔积
- 我关闭会话后没有抛出
LazyInitializationException
我尝试过的一些事情是:
Some things I've tried are:
1..急切地carCountMap
集合并在映射上指定一个where
属性,例如:
1. Make the carCountMap
collection eager and specify a where
attribute on the mapping such as:
<map name="carCountMap" fetch="subselect" lazy="false" where="time_oid > 1000">
(与truckCountMap
类似)
这最适合我想要的集合语义,但是不幸的是,它迫使我对值进行硬编码,因此我无法真正参考过去的3个月. time_oid
每天增加1.
This fits in best with the collection semantics that I want, but unfortunately it forces me to hardcode a value, so I can't really refer to the last 3 months. time_oid
increases by 1 every day.
2..将地图定义为惰性地图,并使用hql查询手动连接3个表:
2. Define maps as lazy and use an hql query to manually join the 3 tables:
from Road r
left outer join fetch r.carCountMap ccm
left outer join fetch r.truckCoutnMap tcm
where (ccm.time.oid > :startDate)
or (tcm.time.oid > :startDate)
此问题是,结果查询返回几百万行,而应该是1万条道路*每月(每周)4次测量* 3个月=〜120k.该查询大约在一个小时内完成,而方法1(就我而言,它加载的数据完全相同)在3分钟内完成,这是很荒谬的.
The problem with this is that the resulting query returns several millions of rows, whereas it should be 10k roads * 4 measurements per month (weekly) * 3 months = ~120k. This query completes in about an hour, which is ridiculous as approach #1 (which loads the exact same data as far as I'm concerned) completes in 3 minutes.
3..将地图定义为惰性地图,并首先使用标准加载道路,然后运行其他查询以填充集合
3. Define maps as lazy and load the roads first with a criteria, then run additional queries to fill in the collection
List roadList = session.createCriteria(Road.class).list();
session.getNamedQuery("fetchCcm").setLong("startDate", startDate).list();
session.getNamedQuery("fetchTcm").setLong("startDate", startDate).list();
return roadList;
这会触发正确的查询,但是检索到的汽车和卡车数量不会附加到roadList
中的Road
对象.因此,当我尝试访问任何Road对象的计数时,都会得到一个LazyInitializationException
.
This fires the right queries, but the retrieved car and truck counts don't get attached to the Road
objects in roadList
. So when I try to access the counts on any Road object, I get a LazyInitializationException
.
4..将地图定义为惰性地图,使用criteria.list()
加载所有道路,遍历最近3个月的所有测量日期,以便强制加载这些值.
4. Define maps as lazy, use criteria.list()
to load all roads, iterate through all measurement dates in the last 3 months so force those values to be loaded.
我还没有尝试过,因为它听起来很笨拙,而且我不相信它会摆脱LazyInitializationException
I haven't tried this yet because it sounds really clunky, and I'm not convinced it'll get rid of the LazyInitializationException
- 对于使用这些方法遇到的问题,是否有任何变通办法?
- 总共有更好的方法吗?
推荐答案
After digging around some more, it looks like hibernate filters are the exact solution I needed for this.
它们基本上提供了一种构造,该构造在集合或类上具有where
属性,并在运行时绑定了参数.
They basically provide a construct to have a where
attribute on a collection or class, with parameters bound at runtime.
在映射文件中,定义过滤器并将其附加到集合:
In the mapping file, define the filter and attach it to the collections:
<class name="com.foo.Road" table="road">
<id name="oid" column="oid"/>
<map name="carCountMap" fetch="subselect">
<key column="road_oid" foreign-key="oid"/>
<index column="time_oid" type="long" />
<one-to-many class="com.foo.CarCount" />
<filter name="byStartDate" condition="time_oid > :startDate" />
</map>
<map name="truckCountMap" fetch="subselect">
<key column="road_oid" foreign-key="oid"/>
<index column="time_oid" type="long" />
<one-to-many class="com.foo.TruckCount" />
<filter name="byStartDate" condition="time_oid > :startDate" />
</map>
</class>
<filter-def name="byStartDate">
<filter-param name="startDate" type="long"/>
</filter-def>
然后在dao中,启用过滤器,绑定参数并运行查询:
Then in the dao, enable the filter, bind the parameter and run the query:
session.enableFilter("byStartDate").setParameter("startDate", calculatedStartDateOid);
return session.createCriteria(Road.class).list();
这篇关于提取集合子集的策略的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!