在JPA上选择DISTINCT [英] Select DISTINCT on JPA

查看:203
本文介绍了在JPA上选择DISTINCT的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个表,其中包含 ISO 4217值(币种有6行,ID ,国家/地区,货币名称,字母代码,数字代码,次单位).

I have a table with the ISO 4217 values for currencies (With 6 rows, ID, Country, Currency_Name, Alphabetic_code, Numeric_Code, Minor_Unit).

我需要获取 4种最常用货币的一些数据,我的纯" SQL查询是这样的:

I need to get some of the data for the 4 most used currencies, and my "pure" SQL query goes like this:

select distinct currency_name, alphabetic_code, numeric_code 
from currency 
where ALPHABETIC_CODE IN ('USD','EUR','JPY','GBP') 
order by currency_name;

这将返回一个4行的表格,其中包含我需要的数据.到目前为止,一切都很好.

Which returns a 4-row table with the data I need. So far, so good.

现在,我必须将其转换为我们的JPA xml文件,然后问题就开始了.我要获取的查询是这样的:

Now I have to translate this to our JPA xml file, and the problems begin. The query I'm trying to get is like this:

SELECT DISTINCT c.currencyName, c.alphabeticCode, c.numericCode
FROM Currency c 
WHERE c.alphabeticCode IN ('EUR','GBP','USD','JPY') 
ORDER BY c.currencyName

这将返回一个列表,其中每个国家/地区使用一种或多种货币(好像查询中没有"DISTINCT"一样).我正在为之努力.因此问题将是:

This returns a list with one row for each country that has some of those currencies (As if there were no "DISTINCT" on the query). And I'm scratching my head on why. So the questions would be:

1)如何进行此查询以返回纯" SQL给我的内容?

1) How can I make this query to return what the "pure" SQL is giving me?

2)为什么此查询似乎忽略了我的"DISTINCT"子句?这里有些我想念的东西,我什么也没得到.这是怎么回事,我没有得到什么?

2) Why is this query seemingly ignoring my "DISTINCT" clause? There's something I'm missing here, and I don't get what. What's going on, what I'm not getting?

嗯,这越来越奇怪了.不知何故,该JPA查询按预期工作(返回4行).我已经尝试过了(因为我需要更多信息):

Well, this is getting weirder. Somehow, that JPA query works as intended (Returning 4 rows). I've tried this (As I needed some more info):

SELECT DISTINCT c.currencyName, c.alphabeticCode, c.numericCode, c.minorUnit, c.id
FROM Currency c 
WHERE c.alphabeticCode IN ('EUR','GBP','USD','JPY') 
ORDER BY c.currencyName

似乎ID弄乱了所有内容,因为从查询中删除ID会返回4行表.加上括号是没有用的.

And it seems the ID is messing everything, as removing it from the query goes back to return the 4-row table. And putting parenthesis is useless.

顺便说一句,我们正在使用eclipse链接.

btw, we are using eclipse link.

推荐答案

要回答您的问题,您编写的JPQL查询就可以了:

To answer your question, the JPQL query you wrote is just fine:

SELECT DISTINCT c.currencyName, c.alphabeticCode, c.numericCode
FROM Currency c 
WHERE c.alphabeticCode IN ('EUR','GBP','USD','JPY') 
ORDER BY c.currencyName

它应该转换为您期望的SQL语句:

And it should translate to the SQL statement you are expecting:

select distinct currency_name, alphabetic_code, numeric_code 
from currency 
where ALPHABETIC_CODE IN ('USD','EUR','JPY','GBP') 
order by currency_name;

根据基础JPQL或Criteria API查询类型,[DISTINCT] [1]在JPA中具有两种含义.

Depending on the underlying JPQL or Criteria API query type, [DISTINCT][1] has two meanings in JPA.

对于返回标量投影的标量查询,例如以下查询:

For scalar queries, which return a scalar projection, like the following query:

List<Integer> publicationYears = entityManager
.createQuery(
    "select distinct year(p.createdOn) " +
    "from Post p " +
    "order by year(p.createdOn)", Integer.class)
.getResultList();

LOGGER.info("Publication years: {}", publicationYears);

DISTINCT关键字应该传递给基础SQL语句,因为我们希望数据库引擎在返回结果集之前过滤重复项:

The DISTINCT keyword should be passed to the underlying SQL statement because we want the DB engine to filter duplicates prior to returning the result set:

SELECT DISTINCT
    extract(YEAR FROM p.created_on) AS col_0_0_
FROM
    post p
ORDER BY
    extract(YEAR FROM p.created_on)

-- Publication years: [2016, 2018]

实体查询

对于实体查询,DISTINCT具有不同的含义.

在不使用DISTINCT的情况下,进行如下查询:

Without using DISTINCT, a query like the following one:

List<Post> posts = entityManager
.createQuery(
    "select p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title = :title", Post.class)
.setParameter(
    "title", 
    "High-Performance Java Persistence eBook has been released!"
)
.getResultList();

LOGGER.info(
    "Fetched the following Post entity identifiers: {}", 
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

将像这样联接postpost_comment表:

SELECT p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM   post p
LEFT OUTER JOIN
       post_comment pc ON p.id=pc.post_id
WHERE
       p.title='High-Performance Java Persistence eBook has been released!'

-- Fetched the following Post entity identifiers: [1, 1]

但是对于每个关联的post_comment行,其父post记录在结果集中都是重复的.因此,Post实体的List将包含重复的Post实体引用.

But the parent post records are duplicated in the result set for each associated post_comment row. For this reason, the List of Post entities will contain duplicate Post entity references.

要消除Post实体引用,我们需要使用DISTINCT:

To eliminate the Post entity references, we need to use DISTINCT:

List<Post> posts = entityManager
.createQuery(
    "select distinct p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title = :title", Post.class)
.setParameter(
    "title", 
    "High-Performance Java Persistence eBook has been released!"
)
.getResultList();
 
LOGGER.info(
    "Fetched the following Post entity identifiers: {}", 
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

但是随后DISTINCT也被传递给SQL查询,这是完全不希望的:

But then DISTINCT is also passed to the SQL query, and that's not desirable at all:

SELECT DISTINCT
       p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM   post p
LEFT OUTER JOIN
       post_comment pc ON p.id=pc.post_id
WHERE
       p.title='High-Performance Java Persistence eBook has been released!'
 
-- Fetched the following Post entity identifiers: [1]

通过将DISTINCT传递给SQL查询,EXECUTION PLAN将执行额外的 Sort 阶段,该阶段会增加开销而不会带来任何价值,因为父子组合始终会返回唯一记录,因为子PK列的内容:

By passing DISTINCT to the SQL query, the EXECUTION PLAN is going to execute an extra Sort phase which adds an overhead without bringing any value since the parent-child combinations always return unique records because of the child PK column:

Unique  (cost=23.71..23.72 rows=1 width=1068) (actual time=0.131..0.132 rows=2 loops=1)
  ->  Sort  (cost=23.71..23.71 rows=1 width=1068) (actual time=0.131..0.131 rows=2 loops=1)
        Sort Key: p.id, pc.id, p.created_on, pc.post_id, pc.review
        Sort Method: quicksort  Memory: 25kB
        ->  Hash Right Join  (cost=11.76..23.70 rows=1 width=1068) (actual time=0.054..0.058 rows=2 loops=1)
              Hash Cond: (pc.post_id = p.id)
              ->  Seq Scan on post_comment pc  (cost=0.00..11.40 rows=140 width=532) (actual time=0.010..0.010 rows=2 loops=1)
              ->  Hash  (cost=11.75..11.75 rows=1 width=528) (actual time=0.027..0.027 rows=1 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 9kB
                    ->  Seq Scan on post p  (cost=0.00..11.75 rows=1 width=528) (actual time=0.017..0.018 rows=1 loops=1)
                          Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
                          Rows Removed by Filter: 3
Planning time: 0.227 ms
Execution time: 0.179 ms

具有HINT_PASS_DISTINCT_THROUGH的实体查询

要从执行计划中消除排序"阶段,我们需要使用HINT_PASS_DISTINCT_THROUGH JPA查询提示:

Entity queries with HINT_PASS_DISTINCT_THROUGH

To eliminate the Sort phase from the execution plan, we need to use the HINT_PASS_DISTINCT_THROUGH JPA query hint:

List<Post> posts = entityManager
.createQuery(
    "select distinct p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title = :title", Post.class)
.setParameter(
    "title", 
    "High-Performance Java Persistence eBook has been released!"
)
.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)
.getResultList();
 
LOGGER.info(
    "Fetched the following Post entity identifiers: {}", 
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

现在,SQL查询将不包含DISTINCT,但是将删除Post实体引用重复项:

And now, the SQL query will not contain DISTINCT but Post entity reference duplicates are going to be removed:

SELECT
       p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM   post p
LEFT OUTER JOIN
       post_comment pc ON p.id=pc.post_id
WHERE
       p.title='High-Performance Java Persistence eBook has been released!'
 
-- Fetched the following Post entity identifiers: [1]

执行计划将确认我们这次不再具有额外的排序阶段:

And the Execution Plan is going to confirm that we no longer have an extra Sort phase this time:

Hash Right Join  (cost=11.76..23.70 rows=1 width=1068) (actual time=0.066..0.069 rows=2 loops=1)
  Hash Cond: (pc.post_id = p.id)
  ->  Seq Scan on post_comment pc  (cost=0.00..11.40 rows=140 width=532) (actual time=0.011..0.011 rows=2 loops=1)
  ->  Hash  (cost=11.75..11.75 rows=1 width=528) (actual time=0.041..0.041 rows=1 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Seq Scan on post p  (cost=0.00..11.75 rows=1 width=528) (actual time=0.036..0.037 rows=1 loops=1)
              Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
              Rows Removed by Filter: 3
Planning time: 1.184 ms
Execution time: 0.160 ms

这篇关于在JPA上选择DISTINCT的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆