PostgreSQL query with dynamic number of columns

Question

I am trying to find a method to return a record set with a dynamic number of columns. I can write one query that will produce the list of column names I need as such:

SELECT DISTINCT name FROM tests WHERE "group" = 'basic';

This will return a short list like 'poke', 'prod', 'hit', 'drop', etc. Then I want a table produced showing a series of days and, for each day, the result of each of those tests. Every morning we look at what the developers have been doing and poke and prod at it, so each test will be run each day. This query I can write statically:

SELECT (SELECT success FROM test_results AS i
        WHERE i.name = 'poke'
        AND i.date = o.date) AS poke,
       (SELECT success FROM test_results AS i
        WHERE i.name = 'prod'
        AND i.date = o.date) AS prod,
...
FROM test_results AS o GROUP BY date
HAVING date > now() - '1 week'::interval;

However, this is hard-coded to the tests we are running on each day. If we now need to start kicking the device each day, we need to update the query. If we decide the drop test is no longer needed, after a week, the drop test column should drop off the report as it no longer occurs in the results. Returning NULL for missing tests when only certain dates have a results entry is perfectly acceptable.

Is there a method to create a dynamic list of columns from the results by just using regular SQL in a query?

I was attempting to build up the data I need in parts by using a WITH query, but I can't find a way to build up the final row correctly from dynamic information.

Here's some sample data from the last two days:

CREATE TABLE test_results (
    name TEXT NOT NULL,
    date DATE default now() NOT NULL,
    success BOOLEAN NOT NULL
);

INSERT INTO test_results (name, date, success) VALUES ('hit',  '2017-06-20', TRUE);
INSERT INTO test_results (name, date, success) VALUES ('poke', '2017-06-20', TRUE);
INSERT INTO test_results (name, date, success) VALUES ('prod', '2017-06-20', TRUE);

INSERT INTO test_results (name, date, success) VALUES ('poke', '2017-06-21', TRUE);
INSERT INTO test_results (name, date, success) VALUES ('prod', '2017-06-21', TRUE);

INSERT INTO test_results (name, date, success) VALUES ('poke', '2017-06-22', TRUE);
INSERT INTO test_results (name, date, success) VALUES ('prod', '2017-06-22', FALSE);

INSERT INTO test_results (name, date, success) VALUES ('poke', '2017-06-23', TRUE);
INSERT INTO test_results (name, date, success) VALUES ('prod', '2017-06-23', TRUE);
INSERT INTO test_results (name, date, success) VALUES ('drop', '2017-06-23', TRUE);

If I run my query against the date range 2017-06-21 to 2017-06-23, I'd like to get results like the following, with a matrix of every test that was run in that time:

date        | poke   | prod   | drop
------------+--------+--------+-----
2017-06-21  | TRUE   | TRUE   | NULL
2017-06-22  | TRUE   | FALSE  | NULL
2017-06-23  | TRUE   | TRUE   | TRUE

The names poke, prod, and drop are all names found in the name field of some row during that time period. The query returns NULL for any test that has no record on a given date.

Recommended answer

There are several approaches, some already mentioned here, such as crosstab. You can also write your own function that builds the query dynamically and RETURNS TABLE, and there are a few more methods besides.

But all require you to predefine an exact number of outputs and their data types.
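To make the limitation concrete, here is a minimal crosstab sketch. It assumes the tablefunc extension can be installed, and it uses a throwaway table named crosstab_demo (a hypothetical name, holding a few rows copied from the question's sample data) so that it runs on its own. The category list can come from a query, but the column definition list after AS still has to be written out by hand.

```sql
-- Assumes the tablefunc extension is available in this database.
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Throwaway copy of part of the sample data (crosstab_demo is a made-up name).
CREATE TEMP TABLE crosstab_demo (name text, date date, success boolean);
INSERT INTO crosstab_demo (name, date, success) VALUES
    ('poke', '2017-06-22', TRUE),
    ('prod', '2017-06-22', FALSE),
    ('poke', '2017-06-23', TRUE),
    ('prod', '2017-06-23', TRUE),
    ('drop', '2017-06-23', TRUE);

SELECT *
FROM   crosstab(
           -- Source rows: row key, category, value, ordered by the row key.
           $$ SELECT date, name, success FROM crosstab_demo ORDER BY 1, 2 $$,
           -- Category list: this part is a query, so it can follow the data.
           $$ SELECT DISTINCT name FROM crosstab_demo ORDER BY 1 $$
       ) AS ct (date date, "drop" boolean, poke boolean, prod boolean);
-- The column list after AS is the static part. If a new test name shows up,
-- the category query returns one more value than the list declares and the
-- call fails until the list is edited.
```

Days without a result for a category get NULL in that column, which matches what the question asks for.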

If I understand your case, that is exactly what you want to avoid, as you mentioned:

If we now need to start kicking the device each day, we need to update the query.

That is pretty much the same downside you get with crosstab and the other approaches.
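If one column per test is not a hard requirement, a different technique (not the approach this answer goes on to use) is to fold each day's results into a single jsonb value, so the set of keys can vary without any column list. This is only a sketch: it assumes PostgreSQL 9.5 or later for jsonb_object_agg, and the CTE named test_results inlines a few sample rows so the statement runs on its own. Drop the CTE to query the real table.

```sql
-- The CTE stands in for the real test_results table (sample rows inlined).
WITH test_results (name, date, success) AS (
    VALUES ('poke', DATE '2017-06-22', TRUE),
           ('prod', DATE '2017-06-22', FALSE),
           ('poke', DATE '2017-06-23', TRUE),
           ('prod', DATE '2017-06-23', TRUE),
           ('drop', DATE '2017-06-23', TRUE)
)
SELECT  date,
        -- One key per test that actually ran that day.
        jsonb_object_agg(name, success) AS results
FROM    test_results
WHERE   date BETWEEN '2017-06-21' AND '2017-06-23'
GROUP BY date
ORDER BY date;
```

The trade-off is that a test with no record on a given day is simply a missing key rather than a NULL column, and the client has to unpack the JSON.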

So there is another way: cursors. It is probably not the best way to go, and if you can use crosstab, that is probably better.
But it is at least an option, which I'll add here with comments in the code.

Solution:

-- Function for opening cursor
CREATE OR REPLACE
FUNCTION    test_stats(
                c REFCURSOR,    -- cursor name
                sdate date,     -- start date of period wanted (included)
                edate date,     -- end date of period wanted (included)
                gtype text      -- you had in your 'tests' table some group type which I included just in case
            )
RETURNS     REFCURSOR
LANGUAGE    PLPGSQL
AS
$main$
BEGIN
    OPEN    c
    FOR
    -- Following dynamic query building can be
    -- used also if want to go with function that RETURNS TABLE
    EXECUTE format(
            '   SELECT  r.date,
                        %s
                FROM    test_results r
                WHERE   r.date BETWEEN %L AND %L
                GROUP BY 1
            ',
                -- Here we build for each 'name' own statement and 
                -- aggregate together with comma separator to feed
                -- into main query.
                -- P.S. We need to double check result unfortunately
                --      against test_results table once to get pre-filter
                --      for names in specified date range.
                --      With this we eliminate tests that for sure will
                --      not be presented in the range. In given test data
                --      this means eliminating 'hit'.
            (
                SELECT  string_agg(
                            DISTINCT format(
                                '(  SELECT  success
                                    FROM    test_results i
                                    WHERE   i.name = %1$L
                                    AND     i.date = r.date ) AS %1$I',
                                t.name
                            ),
                            ','
                        )
                FROM    tests t,
                LATERAL (   SELECT  array_agg( DISTINCT r.name )
                            FROM    test_results r
                            WHERE   r.date BETWEEN sdate AND edate
                        ) a( lst )
                WHERE   t."group" = gtype   -- the group type is used here; GROUP is a reserved word, so it is quoted
                AND     t.name = ANY ( a.lst::text[] )
            ),
            sdate,      -- start date for between statement
            edate       -- end date for between statement
        );
    RETURN c;
END;
$main$;

-- Usage example:
BEGIN;
SELECT test_stats( 'teststats1', '2017-06-21'::date, '2017-06-23'::date, 'basic' );
FETCH ALL IN teststats1;
COMMIT;

-- Result (from your given test data set):
    date    | drop | poke | prod
------------+------+------+------
 2017-06-22 |      | t    | f
 2017-06-21 |      | t    | t
 2017-06-23 | t    | t    | t
(3 rows)
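To reproduce the result above you also need a tests table, which the question mentions but never defines. The definition below is a guess based only on the two columns the queries use (name and group). The primary key, the NOT NULL constraint, and the choice to put every test in the 'basic' group are assumptions.

```sql
-- Hypothetical definition: only the name and group columns are known
-- from the question; everything else here is an assumption.
CREATE TABLE tests (
    name    TEXT PRIMARY KEY,
    "group" TEXT NOT NULL   -- GROUP is a reserved word, so it must be quoted
);

INSERT INTO tests (name, "group") VALUES
    ('hit',  'basic'),
    ('poke', 'basic'),
    ('prod', 'basic'),
    ('drop', 'basic');

-- The column-name query from the question, with the identifier quoted.
SELECT DISTINCT name FROM tests WHERE "group" = 'basic' ORDER BY name;
```

With this in place, 'hit' is in the group but has no rows in the requested date range, so the function's pre-filter leaves it out of the column list.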

As I mentioned, it is not the perfect way, but it does the job :)
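One caveat with the correlated subqueries the function generates: a scalar subquery raises an error if a test has more than one result row for the same day. A possible variation, sketched here and not part of the original answer, is to generate aggregate FILTER expressions instead (PostgreSQL 9.4 or later). Using bool_and to combine several runs on one day is my assumption about the desired semantics. The CTE named test_results inlines a few sample rows so the statement runs on its own.

```sql
-- The CTE stands in for the real test_results table (sample rows inlined).
WITH test_results (name, date, success) AS (
    VALUES ('poke', DATE '2017-06-22', TRUE),
           ('prod', DATE '2017-06-22', FALSE),
           ('poke', DATE '2017-06-23', TRUE),
           ('prod', DATE '2017-06-23', TRUE),
           ('drop', DATE '2017-06-23', TRUE)
)
SELECT  r.date,
        -- Each column aggregates only the rows of one test; a day without
        -- rows for that test yields NULL.
        bool_and(r.success) FILTER (WHERE r.name = 'drop') AS "drop",
        bool_and(r.success) FILTER (WHERE r.name = 'poke') AS poke,
        bool_and(r.success) FILTER (WHERE r.name = 'prod') AS prod
FROM    test_results r
GROUP BY r.date
ORDER BY r.date;

-- A format() template that could replace the subquery template in the
-- function to produce one such column per test name:
SELECT format('bool_and(r.success) FILTER (WHERE r.name = %1$L) AS %1$I', 'poke');
-- returns: bool_and(r.success) FILTER (WHERE r.name = 'poke') AS poke
```

The table is also scanned once rather than once per generated column, though whether that matters depends on the data volume.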
