筛选对具有多个值的crosstab()查询的结果产生意外影响 [英] Unexpected effect of filtering on result from crosstab() query with multiple values

查看:93
本文介绍了筛选对具有多个值的crosstab()查询的结果产生意外影响的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个crosstab()查询,与上一个问题类似:
过滤对结果的预期影响来自crosstab()查询

I have a crosstab() query similar to the one in my previous question:
Unexpected effect of filtering on result from crosstab() query

常见的情况是使用多个值extra1 IN(value1, value2...)过滤extra1字段.对于extra1过滤器中包含的每个值,我都添加了一个类似于(extra1 <> valueN)的排序表达式,如上面提到的帖子所示.结果查询如下:

The common case is to filter extra1 field with multiples values: extra1 IN(value1, value2...). For each value included on the extra1 filter, I have added an ordering expression like this (extra1 <> valueN), as appear on the above mentioned post. The resulting query is as follows:

SELECT *
FROM crosstab(
 'SELECT row_name, extra1, extra2..., another_table.category, value
  FROM   table t
  JOIN   another_table ON t.field_id = another_table.field_id
  WHERE  t.field = certain_value AND t.extra1 IN (val1, val2, ...) --> more values
  ORDER  BY row_name ASC, (extra1 <> val1), (extra1 <> val2)', ... --> more ordering expressions
 'SELECT category_name FROM category_name WHERE field = certain_value'
) AS ct(extra1, extra2...)
WHERE extra1 = val1; --> condition on the result

排序表达式value1中包含的extra1的第一个值,获取正确的结果行.但是,以下value2value3 ...的结果数错误,导致每个结果行较少.为什么呢?

The first value of extra1 included on the ordering expression value1, get the correct resulting rows. However, the following ones value2, value3..., get wrong number of results, resulting on less rows on each one. Why is that?

更新:

将此作为我们的源表(table t):

Giving this as our source table (table t):

+----------+--------+--------+------------------------+-------+
| row_name | Extra1 | Extra2 | another_table.category | value |
+----------+--------+--------+------------------------+-------+
| Name1    | 10     | A      | 1                      | 100   |
| Name2    | 11     | B      | 2                      | 200   |
| Name3    | 12     | C      | 3                      | 150   |
| Name2    | 11     | B      | 3                      | 150   |
| Name3    | 12     | C      | 2                      | 150   |
| Name1    | 10     | A      | 2                      | 100   |
| Name3    | 12     | C      | 1                      | 120   |
+----------+--------+--------+------------------------+-------+

这是我们的类别表:

+-------------+--------+
| category_id | value  |
+-------------+--------+
| 1           | Cat1   |
| 2           | Cat2   |
| 3           | Cat3   |
+-------------+--------+

使用CROSSTAB的想法是得到一个像这样的表:

Using the CROSSTAB, the idea is to get a table like this:

+----------+--------+--------+------+------+------+
| row_name | Extra1 | Extra2 | cat1 | cat2 | cat3 |
+----------+--------+--------+------+------+------+
| Name1    | 10     | A      | 100  | 100  |      |
| Name2    | 11     | B      |      | 200  | 150  |
| Name3    | 12     | C      | 120  | 150  | 150  |
+----------+--------+--------+------+------+------+

这个想法是要能够过滤结果表,这样我就可以得到Extra1列的结果,该列的值是1011,如下所示:

The idea is to be able to filter the resulting table so I get results with Extra1 column with values 10 or 11, as follow:

+----------+--------+--------+------+------+------+
| row_name | Extra1 | Extra2 | cat1 | cat2 | cat3 |
+----------+--------+--------+------+------+------+
| Name1    | 10     | A      | 100  | 100  |      |
| Name2    | 11     | B      |      | 200  | 150  |
+----------+--------+--------+------+------+------+

问题是,在我的查询中,对于Extra1,以10作为值,以及Extra1,以11作为值,我得到了不同的结果大小.使用(Extra1 <> 10)可以在Extra1上获得该值的正确结果大小,但在11作为值的情况下无法得到.

The problem is that on my query, I get different result size for Extra1 with 10 as value and Extra1 with 11 as value. With (Extra1 <> 10) I can get the correct result size on Extra1 for that value but not in the case of 11 as value.

这是一个小提琴,更详细地说明了这个问题:

Here is a fiddle demonstrating the problem in more detail:

https://dbfiddle.uk/?rdbms=postgres_11&fiddle=5c401f7512d52405923374 a>

https://dbfiddle.uk/?rdbms=postgres_11&fiddle=5c401f7512d52405923374c75cb7ff04

推荐答案

所有额外"列都是从该组的第一行复制的(如

All "extra" columns are copied from the first row of the group (as pointed out in my previous answer)

使用以下方法进行过滤:

While you filter with:

.... WHERE extra1 = 'val1';

...在同一列上添加更多的ORDER BY表达式是没有意义的.只有其源组中至少有一个extra1 = 'val1'的行才能保留.

...it makes no sense to add more ORDER BY expressions on the same column. Only rows that have at least one extra1 = 'val1' in their source group survive.

从您的各种评论中,我想您可能希望看到所有分别存在的extra值-在WHERE子句中过滤的集合内-对于同一unixdatetime.如果是这样,请在之前进行汇总.喜欢:

From your various comments, I guess you might want to see all distinct existing values of extra - within the set filtered in the WHERE clause - for the same unixdatetime. If so, aggregate before pivoting. Like:

SELECT * 
FROM   crosstab(
   $$
   SELECT unixdatetime, x.extras, c.name, s.value
   FROM  (
      SELECT unixdatetime, array_agg(extra) AS extras
      FROM  (
         SELECT DISTINCT unixdatetime, extra
         FROM   source_table   s
         WHERE  extra IN (1, 2)     -- condition moves here
         ORDER  BY unixdatetime, extra
         ) sub
      GROUP  BY 1
      ) x
   JOIN   source_table   s USING (unixdatetime)
   JOIN   category_table c ON c.id = s.gausesummaryid
   ORDER  BY 1
   $$
 , $$SELECT unnest('{trace1,trace2,trace3,trace4}'::text[])$$
) AS final_result (unixdatetime int
                 , extras int[]
                 , trace1 numeric
                 , trace2 numeric
                 , trace3 numeric
                 , trace4 numeric);

此外:以下有关第二功能参数的答案也适用于您的情况:

Aside: advice given in the following related answer about the 2nd function parameter applies to your case as well:

我在上面演示了一个静态的第二参数查询.在使用它时,您根本不需要加入category_table.相同,但又短又快,但是:

I demonstrate a static 2nd parameter query above. While being at it, you don't need to join to category_table at all. The same, a bit shorter and faster, yet:

SELECT * 
FROM   crosstab(
   $$
   SELECT unixdatetime, x.extras, s.gausesummaryid, s.value
   FROM  (
      SELECT unixdatetime, array_agg(extra) AS extras
      FROM  (
         SELECT DISTINCT unixdatetime, extra
         FROM   source_table
         WHERE  extra IN (1, 2)     -- condition moves here
         ORDER  BY unixdatetime, extra
         ) sub
      GROUP  BY 1
      ) x
   JOIN   source_table s USING (unixdatetime)
   ORDER  BY 1
   $$
,  $$SELECT unnest('{923,924,926,927}'::int[])$$
) AS final_result (unixdatetime int
                 , extras int[]
                 , trace1 numeric
                 , trace2 numeric
                 , trace3 numeric
                 , trace4 numeric);

db<>小提琴此处 -已添加我的查询在您的小提琴底部.

db<>fiddle here - added my queries at the bottom of your fiddle.

这篇关于筛选对具有多个值的crosstab()查询的结果产生意外影响的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆