How to group results from a query based on one-to-many association by some criterion in the "many"?


Problem description

Please forgive the awkward title. I had a hard time distilling my question into one phrase. If anyone can come up with a better one, feel free to suggest it.

I have the following simplified schema:

vendors
  INT id

locations
  INT id
  INT vendor_id
  FLOAT latitude
  FLOAT longitude

I am perfectly capable of returning a list of the nearest vendors, sorted by proximity and limited by an approximation of a radius:

SELECT * FROM locations
WHERE latitude IS NOT NULL AND longitude IS NOT NULL
  AND ABS(latitude - 30) + ABS(longitude - 30) < 50
ORDER BY ABS(latitude - 30) + ABS(longitude - 30) ASC

At the moment I can't find a way around repeating the distance expression in both the WHERE and ORDER BY clauses. I initially tried aliasing it as "distance" among the SELECT fields, but psql told me that this alias wasn't available in the WHERE clause. Fine. If there's some fancy-pants way around this, I'm all ears, but on to my main question:
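One common way to avoid the repetition, sketched here against the simplified schema above, is to compute the expression once in a subquery and then filter and sort on its alias in the outer query. The subquery alias `located` is a name chosen for illustration only.

```sql
-- Compute the approximate distance once, then reuse the alias outside.
-- "located" is a hypothetical alias for the derived table.
SELECT *
FROM (
  SELECT locations.*,
         ABS(latitude - 30) + ABS(longitude - 30) AS distance
  FROM locations
  WHERE latitude IS NOT NULL
    AND longitude IS NOT NULL
) AS located
WHERE distance < 50      -- the alias is visible here because it comes from the subquery
ORDER BY distance ASC;
```

The alias is unavailable in the original WHERE clause because WHERE is evaluated before the SELECT list; wrapping the query makes `distance` an ordinary column of the derived table.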

What I want to do is return a list of vendors, each joined with its closest location, with the list sorted by proximity and limited by radius.

So suppose I have 2 vendors, each with two locations. If the radius is limited so that only one of the four locations falls within it, I want the query to return that location alongside its associated vendor. If the radius encompasses all the locations, I want vendor 1 presented with the closest of its locations and vendor 2 with the closest of its locations, with vendors 1 and 2 ultimately ordered by the proximity of their closest location.

In MySQL, I managed to get the closest location in each vendor's row by using GROUP BY and then MIN(distance). But PostgreSQL seems to be stricter about the usage of GROUP BY.
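For comparison, here is a minimal sketch of a GROUP BY form that PostgreSQL accepts, assuming the simplified schema above. PostgreSQL requires every selected column to be either listed in GROUP BY or wrapped in an aggregate (columns functionally dependent on a grouped primary key are the exception), so this version returns only the vendor id and its smallest distance, not the whole closest location row.

```sql
-- Smallest approximate distance per vendor, within the radius.
SELECT vendor_id,
       MIN(ABS(latitude - 30) + ABS(longitude - 30)) AS distance
FROM locations
WHERE latitude IS NOT NULL
  AND longitude IS NOT NULL
GROUP BY vendor_id
-- HAVING cannot see the SELECT alias, so the aggregate is repeated here.
HAVING MIN(ABS(latitude - 30) + ABS(longitude - 30)) < 50
ORDER BY distance ASC;
```

Getting the other columns of the closest location would need a join back to `locations` or a different technique, which is why a plain GROUP BY is awkward for this problem.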

If possible, I'd like to avoid meddling with the SELECT clause. I'd also like, if possible, to reuse the WHERE and ORDER BY parts of the above query. But these are by no means absolute requirements.

I have made clumsy attempts at DISTINCT ON and GROUP BY, but these gave me a fair bit of trouble, mostly because I was missing mirrored statements elsewhere in the query, which I won't go into in detail now.
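As an illustration of the "mirroring" that DISTINCT ON demands, here is a sketch assuming the simplified schema above. In PostgreSQL the leading ORDER BY expressions must match the DISTINCT ON expressions; the remaining ORDER BY terms decide which row survives for each group. The alias `closest` is hypothetical.

```sql
-- Keep one row per vendor: the location with the smallest distance.
SELECT *
FROM (
  SELECT DISTINCT ON (vendor_id)
         vendor_id,
         id AS location_id,
         ABS(latitude - 30) + ABS(longitude - 30) AS distance
  FROM locations
  WHERE latitude IS NOT NULL
    AND longitude IS NOT NULL
  -- vendor_id must come first to mirror DISTINCT ON (vendor_id)
  ORDER BY vendor_id, distance
) AS closest
WHERE distance < 50      -- radius check applied to each vendor's closest location
ORDER BY distance ASC;   -- final ordering across vendors
```

The outer query is needed because the inner ORDER BY is tied to `vendor_id` first, so sorting vendors by proximity has to happen in a separate step.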

I ended up going with a solution based on OMG Ponies' excellent answer:

SELECT vendors.* FROM (
  SELECT locations.*, 
    ABS(locations.latitude - 2.1) + ABS(locations.longitude - 2.1) AS distance,
    ROW_NUMBER() OVER(PARTITION BY locations.locatable_id, locations.locatable_type
      ORDER BY ABS(locations.latitude - 2.1) + ABS(locations.longitude - 2.1) ASC) AS rank
    FROM locations
    WHERE locations.latitude IS NOT NULL
    AND locations.longitude IS NOT NULL
    AND locations.locatable_type = 'Vendor'
  ) ranked_locations
INNER JOIN vendors ON vendors.id = ranked_locations.locatable_id
WHERE (ranked_locations.rank = 1)
  AND (ranked_locations.distance <= 0.5)
ORDER BY ranked_locations.distance;

Some deviations from OMG Ponies' solution:

  • Locations are now polymorphically associated via _type. A bit of a premise change.
  • I moved the join outside the subquery. I don't know if there are performance implications, but it made sense to me to see the subquery as fetching the locations and their partitioned rankings, and the outer query as bringing it all together.
  • (Minor) I took away the table name aliasing. Although I'm plenty used to aliasing, it just made the query harder for me to follow. I'll wait until I'm more experienced with PostgreSQL before working in that flair.

Recommended answer

For PostgreSQL 8.4+, you can use analytic (window) functions like ROW_NUMBER:

SELECT x.*
  FROM (SELECT v.*,
               t.*,
               ABS(t.latitude - 30) + ABS(t.longitude - 30) AS distance,
               ROW_NUMBER() OVER(PARTITION BY v.id
                                     ORDER BY ABS(t.latitude - 30) + ABS(t.longitude - 30)) AS rank
          FROM VENDORS v
          JOIN LOCATIONS t ON t.vendor_id = v.id
         WHERE t.latitude IS NOT NULL 
           AND t.longitude IS NOT NULL) x
  WHERE x.rank = 1
    AND x.distance < 50
ORDER BY x.distance

I left the filtering on distance in place, so a vendor whose top-ranked (closest) location is farther than 50 will not appear. Remove the check that the distance is less than 50 if you don't want this to happen.

ROW_NUMBER returns a distinct sequential value that resets for every vendor in this example. If you want tied rows to share the same value (duplicates), you'd need to look at using DENSE_RANK.
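The difference only shows up when two locations of the same vendor are equally distant. The following self-contained sketch uses made-up inline values (not data from the question) to compare the functions side by side:

```sql
-- Hypothetical sample rows: vendor 1 has two locations tied at distance 5.0.
SELECT vendor_id,
       distance,
       ROW_NUMBER() OVER w AS rn,    -- always unique within a vendor
       RANK()       OVER w AS rnk,   -- ties share a value, gaps follow
       DENSE_RANK() OVER w AS drnk   -- ties share a value, no gaps
FROM (VALUES (1, 5.0), (1, 5.0), (1, 8.0), (2, 3.0)) AS t(vendor_id, distance)
WINDOW w AS (PARTITION BY vendor_id ORDER BY distance)
ORDER BY vendor_id, distance, rn;
```

For vendor 1, the two tied rows get `rn` 1 and 2 but share `rnk` 1 and `drnk` 1; the third row gets `rn` 3, `rnk` 3 and `drnk` 2. Filtering on `rn = 1` therefore returns exactly one location per vendor, while filtering on `drnk = 1` would return both tied locations.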

See this article for emulating ROW_NUMBER on PostgreSQL prior to 8.4.
