为每组选择行数,其中两个列值构成一组 [英] Select number of rows for each group where two column values makes one group

查看:33
本文介绍了为每组选择行数,其中两个列值构成一组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两个由 UNION ALL 连接的 select 语句.在第一个语句中,where 子句仅收集先前已向用户显示的行.第二个语句收集所有未向用户显示的行,因此我首先得到查看的结果,然后得到未查看的结果.

I have a two select statements joined by UNION ALL. In the first statement a where clause gathers only rows that have been shown previously to the user. The second statement gathers all rows that haven't been shown to the user, therefore I end up with the viewed results first and non-viewed results after.

当然,这可以通过使用简单的 ORDER BY 的相同选择语句来实现,但是在您意识到我希望完成的事情后,两个单独选择的原因很简单.

Of course this could simply be achieved with the same select statement using a simple ORDER BY, however the reason for two separate selects is simple after you realize what I hope to accomplish.

考虑以下结构和数据.

+----+------+-----+--------+------+
| id | from | to  | viewed | data |
+----+------+-----+--------+------+
| 1  | 1    | 10  | true   | .... |
| 2  | 10   | 1   | true   | .... |
| 3  | 1    | 10  | true   | .... |
| 4  | 6    | 8   | true   | .... |
| 5  | 1    | 10  | true   | .... |
| 6  | 10   | 1   | true   | .... |
| 7  | 8    | 6   | true   | .... |
| 8  | 10   | 1   | true   | .... |
| 9  | 6    | 8   | true   | .... |
| 10 | 2    | 3   | true   | .... |
| 11 | 1    | 10  | true   | .... |
| 12 | 8    | 6   | true   | .... |
| 13 | 10   | 1   | false  | .... |
| 14 | 1    | 10  | false  | .... |
| 15 | 6    | 8   | false  | .... |
| 16 | 10   | 1   | false  | .... |
| 17 | 8    | 6   | false  | .... |
| 18 | 3    | 2   | false  | .... |
+----+------+-----+--------+------+

基本上我希望语句选择所有未查看的行,这是通过检查天气 viewed 列是 truefalse 来完成的>,非常简单明了,这里没有什么可担心的.

Basically I wish all non viewed rows to be selected by the statement, that is accomplished by checking weather the viewed column is true or false, pretty simple and straightforward, nothing to worry here.

然而,当涉及到已经查看的行时,意味着列 viewed 为 TRUE,对于这些记录,我只希望为每个组返回 3 行.

However when it comes to the rows already viewed, meaning the column viewed is TRUE, for those records I only want 3 rows to be returned for each group.

本例中的适当结果应该是每组最近的 3 行.

The appropriate result in this instance should be the 3 most recent rows of each group.

+----+------+-----+--------+------+
| id | from | to  | viewed | data |
+----+------+-----+--------+------+
| 6  | 10   | 1   | true   | .... |
| 7  | 8    | 6   | true   | .... |
| 8  | 10   | 1   | true   | .... |
| 9  | 6    | 8   | true   | .... |
| 10 | 2    | 3   | true   | .... |
| 11 | 1    | 10  | true   | .... |
| 12 | 8    | 6   | true   | .... |
+----+------+-----+--------+------+

正如您从理想结果集中看到的,我们分为三组.因此,查看结果的所需查询应为其找到的每个分组显示最多 3 行.在这种情况下,这些分组是 10 与 1 和 8 与 6,两者都有三行要显示,而另一个组 2 和 3 只显示一行.

As you see from the ideal result set we have three groups. Therefore the desired query for the viewed results should show a maximum of 3 rows for each grouping it finds. In this case these groupings were 10 with 1 and 8 with 6, both which had three rows to be shown, while the other group 2 with 3 only had one row to be shown.

请注意,from = xto = y 的分组与 from = y 的分组相同>to = x.因此,考虑第一个分组(10 加 1),如果 from = 1from = 10to = 1 是同一组to = 10.

Please note that where from = x and to = y, makes the same grouping as if it was from = y and to = x. Therefore considering the first grouping (10 with 1), from = 10 and to = 1 is the same group if it was from = 1 and to = 10.

但是,整个表中有很多组,我只希望在 select 语句中返回每个组中最近的 3 个,这就是我的问题,我不确定如何以最有效的方式实现考虑到该表在某些时候将有数百甚至数千条记录.

However there are plenty of groups in the whole table that I only wish the 3 most recent of each to be returned in the select statement, and thats my problem, I not sure how that can be accomplished in the most efficient way possible considering the table will have hundreds if not thousands of records at some point.

感谢您的帮助.

注意:idfromtoviewed 列是索引,这应该有助于提高性能.

Note: The columns id, from, to and viewed are indexed, that should help with performance.

PS:我不确定如何准确命名这个问题,如果您有更好的主意,请成为我的客人并编辑标题.

PS: I'm unsure on how to name this question exactly, if you have a better idea, be my guest and edit the title.

推荐答案

真是一团糟!当您从最近移动到第二最近,再到第三最近时,这会变得越来越困难.

What a hairball! This gets progressively harder as you move from most recent, to second most recent, to third most recent.

让我们通过获取我们需要的 ID 列表来将它们放在一起.然后我们可以通过 ID 从表中提取项目.

Let's put this together by getting the list of IDs we need. Then we can pull the items from the table by ID.

这个相对简单的查询让您获得最近商品的 ID

This, relatively easy, query gets you the ids of your most recent items

 SELECT id FROM
    (SELECT max(id) id, fromitem, toitem
       FROM stuff
      WHERE viewed = 'true'
      GROUP BY fromitem, toitem
    )a

小提琴:http://sqlfiddle.com/#!2/f7045/27/0

接下来,我们需要获取第二个最近项目的 id.为此,我们需要一个自联接样式的查询.我们需要做同样的总结,但在一个忽略最近项目的虚拟表上.

Next, we need to get the ids of the second most recent items. To do this, we need a self-join style query. We need to do the same summary but on a virtual table that omits the most recent items.

select id from (
  select max(b.id) id, b.fromitem, b.toitem
    from stuff a
    join
           (select id, fromitem, toitem
            from stuff
           where viewed = 'true'
            ) b on (    a.fromitem = b.fromitem 
                    and a.toitem = b.toitem
                    and b.id < a.id)
   where a.viewed = 'true'
   group by fromitem, toitem
  )c

小提琴:http://sqlfiddle.com/#!2/f7045/44/0

最后,我们需要获取第三个最近项目的 id.怜悯!我们需要将刚才的查询加入到表中.

Finally, we need to get the ids of the third most recent items. Mercy! We need to join that query we just had, to the table again.

select id from
(
  select max(d.id) id, d.fromitem, d.toitem
    from stuff d
     join 
    (
       select max(b.id) id, b.fromitem, b.toitem
          from stuff a
          join
            (
               select id, fromitem, toitem
                 from stuff
                where viewed = 'true'
            ) b on  (    a.fromitem = b.fromitem 
                     and a.toitem = b.toitem
                     and b.id < a.id)
          where a.viewed = 'true'
          group by fromitem, toitem
     ) c on (    d.fromitem = c.fromitem
             and d.toitem = c.toitem
             and d.id < c.id)
    where d.viewed='true'
  group by d.fromitem, d.toitem
 ) e

小提琴:http://sqlfiddle.com/#!2/f7045/45/0

所以,现在我们对所有这些 id 进行联合,并使用它们从表中获取正确的行,我们就完成了.

So, now we take the union of all those ids, and use them to grab the right rows from the table, and we're done.

SELECT * 
  FROM STUFF
 WHERE ID IN
(

SELECT id FROM
    (SELECT max(id) id, fromitem, toitem
       FROM stuff
      WHERE viewed = 'true'
      GROUP BY fromitem, toitem
    )a
UNION
select id from (
  select max(b.id) id, b.fromitem, b.toitem
    from stuff a
    join
           (select id, fromitem, toitem
            from stuff
           where viewed = 'true'
            ) b on (    a.fromitem = b.fromitem 
                    and a.toitem = b.toitem
                    and b.id < a.id)
   where a.viewed = 'true'
   group by fromitem, toitem
  )c
UNION
select id from
(
  select max(d.id) id, d.fromitem, d.toitem
    from stuff d
     join 
    (
       select max(b.id) id, b.fromitem, b.toitem
          from stuff a
          join
            (
               select id, fromitem, toitem
                 from stuff
                where viewed = 'true'
            ) b on  (    a.fromitem = b.fromitem 
                     and a.toitem = b.toitem
                     and b.id < a.id)
          where a.viewed = 'true'
          group by fromitem, toitem
     ) c on (    d.fromitem = c.fromitem
             and d.toitem = c.toitem
             and d.id < c.id)
    where d.viewed='true'
  group by d.fromitem, d.toitem
 ) e
UNION
select id from stuff where viewed='false'
)
order by viewed desc, fromitem, toitem, id desc

嘻嘻.SQL 太多了.小提琴:http://sqlfiddle.com/#!2/f7045/47/0

Tee hee. Too much SQL. Fiddle: http://sqlfiddle.com/#!2/f7045/47/0

现在,我们需要处理您的最后一个要求,即您的图是无序的.也就是说,from=n to=m 和 from=m to=n 是一样的.

And now, we need to cope with your last requirement, the requirement that your graph is unordered. That is, that from=n to=m is the same as from=m to=n.

为此,我们需要一个虚拟表而不是物理表.这将解决问题.

To do this we need a virtual table instead of the physical table. This will do the trick.

 SELECT id, least(fromitem, toitem) fromitem, greatest(fromitem,toitem) toitem, data
   FROM stuff

现在我们需要使用这个虚拟表,这个视图,到处都是以前出现的物理表.让我们使用一个视图来做到这一点.

Now we need to use this virtual table, this view, everywhere the physical table used to appear. Let's use a view to do this.

CREATE VIEW 
AS 
SELECT id,
       LEAST(fromitem, toitem) fromitem,
       GREATEST (fromitem, toitem) toitem,
       viewed,
       data;

所以,我们的最终查询是:

So, our ultimate query is:

SELECT *
      FROM stuff
     WHERE ID IN
    (

    SELECT id FROM
        (SELECT max(id) id, fromitem, toitem
           FROM STUFF_UNORDERED
          WHERE viewed = 'true'
          GROUP BY fromitem, toitem
        )a
    UNION
    SELECT id FROM (
      SELECT max(b.id) id, b.fromitem, b.toitem
        FROM STUFF_UNORDERED a
        JOIN
               (SELECT id, fromitem, toitem
                FROM STUFF_UNORDERED
               WHERE viewed = 'true'
                ) b ON (    a.fromitem = b.fromitem
                        AND a.toitem = b.toitem
                        AND b.id < a.id)
       WHERE a.viewed = 'true'
       GROUP BY fromitem, toitem
      )c
    UNION
    SELECT id FROM
    (
      SELECT max(d.id) id, d.fromitem, d.toitem
        FROM STUFF_UNORDERED d
         JOIN
        (
           SELECT max(b.id) id, b.fromitem, b.toitem
              FROM STUFF_UNORDERED a
              JOIN
                (
                   SELECT id, fromitem, toitem
                     FROM STUFF_UNORDERED
                    WHERE viewed = 'true'
                ) b ON  (    a.fromitem = b.fromitem
                         AND a.toitem = b.toitem
                         AND b.id < a.id)
              WHERE a.viewed = 'true'
              GROUP BY fromitem, toitem
         ) c ON (    d.fromitem = c.fromitem
                 AND d.toitem = c.toitem
                 AND d.id < c.id)
        WHERE d.viewed='true'
      GROUP BY d.fromitem, d.toitem
     ) e
    UNION
    SELECT id FROM STUFF_UNORDERED WHERE viewed='false'
    )
    ORDER BY viewed DESC,
            least(fromitem, toitem),
            greatest(fromitem, toitem),
            id DESC

小提琴:http://sqlfiddle.com/#!2/8c154/4/0

这篇关于为每组选择行数,其中两个列值构成一组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆