PostgreSQL - Select only 1 row for each ID

Question

Situation

I am working on a travel engine site and am writing a complex query to match visitors' search queries with their bookings based on IP address, destination, and date, so that I can work out the conversion ratio later.

Problem

There need to be multiple conversion ratios, one per value of a parameter (in this case the utm_source, which I extract from the RequestUrl stored in the searches table). The problem is that some users make multiple searches from different locations: sometimes we get the utm_source in the request and sometimes we do not. And, of course, we need to match each search to only 1 booking. The query result (a screenshot in the original post) illustrates this.

In that result, the 3rd and 4th rows have the same booking ID and so on, but different values in the Value column. I need to select only one of these rows, not both. Basically, if there is more than one row for a booking, I need to choose the one whose Value is not "N/A".

My query:

SELECT DISTINCT "B"."Id" AS "BookingId", "PQ"."IPAddress", "PQ"."To", "PQ"."SearchDate", "PQ"."Value"
FROM
(
    SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value"
    FROM dbo."PackageQueries"
    WHERE "SiteId" = '<The ID>'
    AND "CreatedAt" >= '<Start Date>'
    AND "CreatedAt" < '<End Date>'
) AS "PQ"
INNER JOIN dbo."Bookings" AS "B"
    ON "PQ"."IPAddress" = "B"."IPAddress"
    AND "B"."To" = "PQ"."To"
    AND "B"."BookingDate"::date = "PQ"."SearchDate"
WHERE "B"."SiteId" = '<The ID>'
AND "B"."BookingStatus" = 2
AND "B"."BookingDate" >= '<Start Date>'
AND "B"."BookingDate" < '<End Date>'
ORDER BY "B"."Id", "PQ"."IPAddress", "PQ"."To";
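
The "Value" column comes from the COALESCE(SUBSTRING(...)) expression in the subquery. As a minimal sketch, it can be tried in isolation on made-up URLs (the two request URLs below are hypothetical; only the extraction expression is taken from the query above). With a regular expression that contains a parenthesized group, SUBSTRING returns the text matched by that group, or NULL when there is no match, which COALESCE then turns into 'N/A'.

```sql
-- Hypothetical request URLs; only the extraction expression comes from the query above.
SELECT
    "RequestUrl",
    -- Capture everything after "utm_source=" up to the next "&"; fall back to 'N/A'.
    COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value"
FROM (
    VALUES
        ('/search?to=Paris&utm_source=google&page=1'),
        ('/search?to=Paris')
) AS t ("RequestUrl");
```

The first row yields 'google' and the second yields 'N/A', which is how the same visitor can end up with two different values for one booking.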

Recommended answer

I found a solution and based it on what I found here: Return rows that are max of one column in Postgresql, and here: Postgres CASE in ORDER BY using an alias.

My solution is as follows:

SELECT "BookingId", "IPAddress", "To", "SearchDate", "Value"
FROM
(
    SELECT DISTINCT
        "B"."Id" AS "BookingId",
        "PQ"."IPAddress",
        "PQ"."To",
        "PQ"."SearchDate",
        "PQ"."Value",
        RANK() OVER
        (
            PARTITION BY "B"."Id"
            ORDER BY
            CASE
                WHEN "PQ"."Value" = 'N/A' THEN 1
                ELSE 0
            END
        ) AS "RowNumber"
    FROM
    (
        SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value"
        FROM dbo."PackageQueries"
        WHERE "SiteId" = '<Site ID>'
        AND "CreatedAt" >= '<Start Date>'
        AND "CreatedAt" < '<End Date>'
    ) AS "PQ"
    INNER JOIN dbo."Bookings" AS "B"
        ON "PQ"."IPAddress" = "B"."IPAddress"
        AND "B"."To" = "PQ"."To"
        AND "B"."BookingDate"::date = "PQ"."SearchDate"
    WHERE "B"."SiteId" = '<Site ID>'
    AND "B"."BookingStatus" = 2
    AND "B"."BookingDate" >= '<Start Date>'
    AND "B"."BookingDate" < '<End Date>'
) T
WHERE "RowNumber" = 1
ORDER BY "BookingId", "IPAddress", "To";

Somewhat long-winded, but it does the trick nicely. I hope it helps others.
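
To see the ranking trick in isolation, here is a minimal sketch on made-up rows (the matches CTE and its values are hypothetical and stand in for the joined result of the full query). The CASE expression sorts 'N/A' after any real value within each booking, so a real utm_source gets rank 1 whenever one exists, and a booking that only has 'N/A' still keeps that row.

```sql
-- Hypothetical sample rows standing in for the joined bookings/searches result.
WITH matches ("BookingId", "Value") AS (
    VALUES
        (1, 'google'),
        (2, 'N/A'),
        (3, 'N/A'),
        (3, 'newsletter')
)
SELECT "BookingId", "Value"
FROM (
    SELECT
        "BookingId",
        "Value",
        -- 'N/A' sorts after any real value within the same booking.
        RANK() OVER (
            PARTITION BY "BookingId"
            ORDER BY CASE WHEN "Value" = 'N/A' THEN 1 ELSE 0 END
        ) AS "RowNumber"
    FROM matches
) AS ranked
WHERE "RowNumber" = 1
ORDER BY "BookingId";
```

This returns one row per booking: (1, 'google'), (2, 'N/A') and (3, 'newsletter').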

Edit

This wasn't the end of the story: there were still some cases in which I was getting more than one value per booking. The answer was to modify the CASE statement to generate a unique number for each text value. The solution can be found here: PostgreSQL - Assign integer value to string in case statement
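
The remaining duplicates come from ties: RANK() gives tied rows the same rank, so two different real values for one booking both receive rank 1. The sketch below is not the fix from the linked answer but a possible alternative, shown on made-up rows (the matches CTE and its values are hypothetical). It swaps in ROW_NUMBER(), which never repeats a number within a partition, and adds "Value" as a tie-breaker, assuming it does not matter which of the real values is kept.

```sql
-- Hypothetical sample rows: booking 3 matches two different real sources.
WITH matches ("BookingId", "Value") AS (
    VALUES
        (3, 'N/A'),
        (3, 'newsletter'),
        (3, 'google'),
        (4, 'N/A')
)
SELECT "BookingId", "Value"
FROM (
    SELECT
        "BookingId",
        "Value",
        ROW_NUMBER() OVER (
            PARTITION BY "BookingId"
            ORDER BY
                CASE WHEN "Value" = 'N/A' THEN 1 ELSE 0 END,  -- real values first
                "Value"                                        -- tie-breaker: alphabetical
        ) AS "RowNumber"
    FROM matches
) AS numbered
WHERE "RowNumber" = 1
ORDER BY "BookingId";
```

Here booking 3 keeps only 'google' (the alphabetically first real value) and booking 4 keeps 'N/A'. If a particular source should win a tie, the tie-breaker would need to encode that preference instead.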
