@BatchSize a smart or stupid use?


Question

First, I'll explain how I understand and use @BatchSize: it exists to load the relations of objects in batches, so that fewer SQL requests are sent to the database. This is especially useful on LAZY @OneToMany relations.

However, it is also useful on LAZY @OneToOne and @ManyToOne relations: if you load a list of entities from the database and then ask for a lazy @*ToOne entity, the entities are loaded in batches, even if I only run a test that loads the relation of the first entity in the list.

Note for anyone who wants to test this: the effect only shows up if the entities are not already loaded. For instance, if you have a list of users with a manager and you list all users, accessing a manager triggers no request when that manager is already loaded.
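To make the batching idea concrete, here is a minimal sketch in plain Java (no Hibernate involved) of how pending identifiers can be grouped into IN lists of a fixed size. The class `InListBatching`, the `manager` table, and the batch size of 2 are assumptions made for illustration; the SQL that Hibernate actually generates depends on the mapping and the version.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class InListBatching {

    /** Splits pending ids into consecutive chunks of at most batchSize elements. */
    static List<List<Long>> chunk(List<Long> ids, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<List<Long>> chunks = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ids.size());
            chunks.add(new ArrayList<>(ids.subList(from, to)));
        }
        return chunks;
    }

    /** Renders a parameterized statement; table and column names are hypothetical. */
    static String selectFor(int idCount) {
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "select * from manager where id in (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // Ids of managers that are still uninitialized proxies (made-up values).
        List<Long> pendingManagerIds = List.of(1L, 2L, 3L, 4L, 5L);
        for (List<Long> batch : chunk(pendingManagerIds, 2)) {
            System.out.println(selectFor(batch.size()) + "  -- ids " + batch);
        }
    }
}
```

With five pending ids and a batch size of 2, this model issues three statements instead of five.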

The only drawback I see with this method is when you load a list of items from the database but only use part of it. This is a post-filtering operation.

So let's get to the main point.

Let's assume that I do everything right so as never to perform post-filtering-like operations, even if that means writing native SQL queries, using DTO objects for multiselect criteria queries, and so on.

1. Am I right to consider that I can simply put @BatchSize on every lazy relation, after having carefully thought about eager loading / join fetching and finally chosen a lazy relation?
2. Is it worth searching for an adequate value for @BatchSize, or can I assume "the bigger the better"? In other words: is there a limit on the number of values in the SQL "IN" operator beyond which my request becomes slow enough not to be worth it anymore? I use Postgres, but I'm interested in answers for other DBMSs too.
3. Optional question: using @BatchSize on a class does not seem to produce much of a result. I still have to annotate every lazy relationship. Did I miss something, or is it useless?

EDIT: The point of my question 3 is that I'm getting different behaviour.

Let's say I'm loading a list of entities of class "A", which has a LAZY OneToMany relationship to B. Now I want to print the creationDate of every B, so I'm writing a classic pair of nested for loops.

I have now annotated B with BatchSize:

• @OneToMany is not annotated with BatchSize: each set of B is loaded independently on each iteration, without batching, so my annotation on the B class seems to be totally ignored. Even if I set the value to "two" and one set has 6 entries, I get one query for that set.
• @OneToMany is annotated: I get the specific batch queries. If I set the batch size to two and I have a total of 10 B across the sets, I get just 5 requests, whatever the number of A I have. If I set it to 100, I get 1 query for the B objects.
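As a rough model of the nested-loop case above, the sketch below counts SQL statements, assuming that a collection-level batch size counts collections (one per A) rather than B rows. `FetchStatementCount` is a hypothetical helper, and the batched figure is a best case: depending on the version and configuration, Hibernate may issue more statements than this.

```java
final class FetchStatementCount {

    /** One query for the list of A, then one query per A to load its collection. */
    static long withoutBatching(long owners) {
        return 1 + owners;
    }

    /** One query for the list of A, then one query per group of batchSize collections. */
    static long withBatching(long owners, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        long collectionQueries = (owners + batchSize - 1) / batchSize; // ceiling division
        return 1 + collectionQueries;
    }

    public static void main(String[] args) {
        long owners = 10; // number of A instances in the list (made-up value)
        System.out.println("no batching:    " + withoutBatching(owners));
        System.out.println("batch size 2:   " + withBatching(owners, 2));
        System.out.println("batch size 100: " + withBatching(owners, 100));
    }
}
```

Under this model, 10 A instances need 5 collection queries with a batch size of 2 and a single one with a batch size of 100, in addition to the query that loads the A list itself.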

PS: I'm not considering any related queries that might fire to load fields of B with fetch select/subselect.

EDIT 2: I just found the post "Why would I not use @BatchSize on every lazy loaded relationship?". I did google and search on SO before posting my question; I guess I didn't use the right words...

However, I'm adding something different that might lead to a different answer: when I wonder about using BatchSize on every relation, it is after having chosen whether I want eager loading with join / select fetch or lazy loading.

Answer

1. Yes, @BatchSize is meant to be used with lazy associations.
2. Hibernate will execute multiple statements in most situations anyway, even if the number of uninitialized proxies/collections is less than the specified batch size. See this answer for more details. Also, a larger number of lighter queries, compared to fewer but bigger ones, may contribute positively to the overall throughput of the system.
3. @BatchSize at class level means that the specified batch size for the entity is applied to all lazy @*ToOne associations with that entity. See the example with the Person entity in the documentation.
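The difference between the two placements can be sketched as a mapping. This is an illustration only and needs Hibernate and a persistence API on the classpath: the `jakarta.persistence` imports assume Hibernate 6 (older versions use `javax.persistence`), and the `id` fields, the `mappedBy` side, and the size of 10 are assumptions not taken from the question.

```java
import jakarta.persistence.Entity;
import jakarta.persistence.FetchType;
import jakarta.persistence.Id;
import jakarta.persistence.ManyToOne;
import jakarta.persistence.OneToMany;
import org.hibernate.annotations.BatchSize;

import java.util.ArrayList;
import java.util.Collection;

@Entity
class A {
    @Id
    private Long id;

    // Collection level: when one uninitialized `bs` collection is touched,
    // the `bs` collections of up to 10 A instances are initialized together.
    @OneToMany(mappedBy = "a", fetch = FetchType.LAZY)
    @BatchSize(size = 10)
    private Collection<B> bs = new ArrayList<>();

    protected A() {
    }
}

@Entity
@BatchSize(size = 10) // Class level: applies when lazy to-one references to B are resolved.
class B {
    @Id
    private Long id;

    // Lazy to-one reference back to the owner; batching of A proxies
    // would require a class-level @BatchSize on A, which is omitted here.
    @ManyToOne(fetch = FetchType.LAZY)
    private A a;

    protected B() {
    }
}
```

Read this way, annotating only the B class would not batch the loading of `A.bs`, which is consistent with the behaviour described in the question.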

The linked question and its answers are more concerned with the need for optimization and lazy loading in general. They apply here as well, of course, but they are not specific to batch loading, which is just one of the possible approaches.

Another important point relates to eager loading, which is mentioned in the linked answers: they suggest that if a property is always used, you may get better performance with eager loading. In general this is not true for collections, and in many situations not for to-one associations either.

For example, suppose you have the following entity, for which bs and cs are always used whenever A is used.

      public class A {
        @OneToMany
        private Collection<B> bs;
      
        @OneToMany
        private Collection<C> cs;
      }
      

Eagerly loading bs and cs obviously suffers from the N+1 selects problem if you don't join them in a single query. But if you do join them in a single query, for example:

      select a from A
        left join fetch a.bs
        left join fetch a.cs
      

then you create a full Cartesian product between bs and cs, returning count(a.bs) x count(a.cs) rows in the result set for each a. These rows are read one by one and assembled into the A entities and their bs and cs collections.

Batch fetching would work very well in this situation, because you would first read the As, then the bs and then the cs. That results in more queries, but with a much smaller total amount of data transferred from the database. Also, the separate queries are much simpler than one big query with joins, and are easier for the database to execute and optimize.
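The row counts behind this argument can be worked out with a small model. The sketch below is plain Java with hypothetical names (`FetchRowCount`, `Sizes`) and made-up collection sizes; it counts rows only and ignores that each joined row is also wider than a row from a single table.

```java
import java.util.Collections;
import java.util.List;

final class FetchRowCount {

    /** Number of elements in the bs and cs collections of one A instance. */
    static final class Sizes {
        final int bs;
        final int cs;

        Sizes(int bs, int cs) {
            this.bs = bs;
            this.cs = cs;
        }
    }

    /** Rows returned when both collections are left-join-fetched in one query. */
    static long joinFetchRows(List<Sizes> owners) {
        long rows = 0;
        for (Sizes s : owners) {
            // A left join keeps the owner row even when a collection is empty.
            rows += (long) Math.max(1, s.bs) * Math.max(1, s.cs);
        }
        return rows;
    }

    /** Rows returned when A, bs and cs are read by separate queries. */
    static long separateQueryRows(List<Sizes> owners) {
        long rows = owners.size();
        for (Sizes s : owners) {
            rows += s.bs + s.cs;
        }
        return rows;
    }

    public static void main(String[] args) {
        // 100 A instances, each with 10 bs and 10 cs (made-up values).
        List<Sizes> owners = Collections.nCopies(100, new Sizes(10, 10));
        System.out.println("join fetch rows:       " + joinFetchRows(owners));
        System.out.println("separate queries rows: " + separateQueryRows(owners));
    }
}
```

For 100 As with 10 bs and 10 cs each, the model gives 10,000 joined rows against 2,100 rows spread over the separate queries.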
