Sql查询以获取连接航班的src到dest路径 [英] Sql query to get src to dest path for connecting flights

查看:104
本文介绍了Sql查询以获取连接航班的src到dest路径的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想解决一个问题,我希望找到 src 到 dest 的飞行路径,包括转机航班,并在可能的情况下根据所需时间对其进行排序.

I want to solve a problem where I wish to find src to dest flight path including connecting flights and if possible sort it for time required.

我可以使用 listagg 之类的东西以某种方式将中间航班聚合为一个字符串吗.

Can I use something like listagg to aggregate intermediate flights as a string somehow.

我们可以将转机航班的数量和时间限制在一个数字内.

We could cap the number of connecting flights to a number and the time taken.

我现在就从这个开始,这让我可以转机

I have this as a start right now which gives me connecting flights

with  cap as (select 30 time_cap , 5 connect_cap), 
 connecting as 
    (select f1.src src1
          , f1.dest dest1
          , f1.stt st1
          , f1.endt end1
          , f2.src src2
          , f2.dest dest2
          , f2.stt st2
          , f2.endt end2
          , (f2.endt - f1.stt) as time 
     from flight f1 
     inner join flight f2 on f1.dest = f2.src 
     where f1.endt < f2.stt)

我的桌子现在看起来像这样

My table looks like this now

\d+ flight
                                  Table "public.flight"
 Column |            Type             | Modifiers | Storage  | Stats target | Description 
--------+-----------------------------+-----------+----------+--------------+-------------
 src    | character varying(20)       |           | extended |              | 
 dest   | character varying(20)       |           | extended |              | 
 stt    | timestamp without time zone |           | plain    |              | 
 endt   | timestamp without time zone |           | plain    |              | 

这是一道已经结束的面试题.

This was an interview question which already got over.

可以在 sql 查询中解决图形 bfs 类型的解决方案吗?

Could a graph bfs kind of solution be solved in an sql query?

即使是不工作的查询(伪代码 - 如果尝试就会工作)或方法也可以.

Even a non working query (pseudo code - which would work if tried) or approach would do.

在下面的查询中,我想在 string_agg 中找出一种方法,我可以在其中检查最后一个目的地是否是我要去的目的地.

In the below query, I want to figure out a way in string_agg where I can check if the last destination is the destination I want to go to.

with flight as (select f1.src||'-'||f1.dest||','||f2.src||'-'||f2.dest route
                     , f1.src src1
                     , f1.dest dest1
                     , f1.stt st1
                     , f1.endt end1
                     , f2.src src2
                     , f2.dest dest2
                     , f2.stt st2
                     , f2.endt end2
                     , (f2.endt - f1.stt) as time 
                from flight f1 
                inner join flight f2 on f1.dest = f2.src 
                where f1.endt < f2.stt) 

select string_agg(route,',') from flight ;

查询航班的输出

  route  | src1 | dest1 |         st1         |        end1         | src2 | dest2 |         st2         |        end2         |   time   
---------+------+-------+---------------------+---------------------+------+-------+---------------------+---------------------+----------
 a-b,b-c | a    | b     | 2017-05-17 09:31:56 | 2017-05-17 14:31:56 | b    | c     | 2017-05-17 15:31:56 | 2017-05-17 16:31:56 | 07:00:00

推荐答案

SQL 中的这些类型的树遍历问题可以使用特殊语法来解决 WITH RECURSIVE.

These types of tree-traversal problems in SQL can be solved using the special syntax WITH RECURSIVE.

为了测试解决方案,让我们用虚拟数据创建下表:

To test the solution, lets create the following table with dummy data:

CREATE TABLE flight (
  src TEXT
, dest TEXT
, stt TIMESTAMP
, endt TIMESTAMP);

INSERT INTO flight VALUES 
('SIN', 'DAC', '2016-12-31 22:45:00', '2017-01-01 01:45:00'),
('DAC', 'SIN', '2017-01-01 16:30:00', '2017-01-01 21:30:00'),
('SIN', 'DAC', '2017-01-01 22:45:00', '2017-01-02 01:45:00'),
('DAC', 'DEL', '2017-01-01 10:00:00', '2017-01-01 11:30:00'),
('DEL', 'DAC', '2017-01-01 12:30:00', '2017-01-01 14:00:00'),
('DAC', 'CCU', '2017-01-01 10:30:00', '2017-01-01 11:15:00'),
('CCU', 'DAC', '2017-01-01 11:45:00', '2017-01-01 12:30:00'),
('SIN', 'DEL', '2017-01-01 11:00:00', '2017-01-01 15:00:00'),
('DEL', 'SIN', '2017-01-01 16:30:00', '2017-01-01 20:30:00'),
('CCU', 'DEL', '2017-01-01 12:00:00', '2017-01-01 12:45:00'),
('DEL', 'CCU', '2017-01-01 13:15:00', '2017-01-01 14:00:00');

递归 CTE 很难理解,所以我将逐个构建逻辑.

Recursive CTEs are difficult to grok, so I'll build up the logic piece by piece.

在递归 CTE 中,有两个部分.一个锚子查询 &递归子查询.先执行锚子查询,然后执行递归子查询,直到返回空结果集.棘手的部分(至少对我而言)是构建将执行从 1 个节点到下一个节点的遍历的连接.

In a recursive CTE there are two parts. An anchor subquery & an recursive subquery. The anchor subquery is executed first, and then the recursive subquery is executed until an empty result set is returned. The tricky part (for me at least) is constructing the join that will perform the traversal from 1 node to the next.

锚点和递归子查询使用UNION ALL(或有时UNION)合并并返回.

The anchor and recursive subqueries are merged using a UNION ALL (or sometimes UNION) and returned.

由于我们对航线感兴趣,一个简单的锚子查询开始是:

Since we are interested in flight routes, a simple anchor sub query to start with would be:

SELECT src, ARRAY[src] path, dest FROM flight

上面的查询有 3 列,分别是开始、路径和最后,在第二列中,我们将 src 字段从 TEXT 转换为 TEXT[].虽然这不是严格需要的,但它会大大简化剩余的步骤,因为数组很容易附加到.

The above query has 3 columns for start, path & end, in the 2nd column, we convert the src field from TEXT to TEXT[]. While this is not strictly needed, it will greatly simplify the remaining steps because arrays are easy to append to.

使用上面的锚查询,我们现在可以定义我们的递归 cte.

Using above anchor query, we can now define our recursive cte.

WITH RECURSIVE flight_paths (src, path, dest) AS (
SELECT
  src
, ARRAY[src]
, dest 
FROM flight
UNION ALL
SELECT
  fp.src
, fp.path || f.src -- appends another 'flight source'
, f.dest -- updates the dest to the new dest
FROM flight f
JOIN flight_paths fp ON f.src = fp.dest 
-- this is the join that links the tree nodes
WHERE NOT f.src = ANY(fp.path) 
-- this condition prevents infinite recursion by not visiting nodes that have already been visited.
) SELECT * FROM flight_paths
-- finally, we can select from the flight_paths recursive cte

以上返回 136 行和我的测试数据.前 15 行如下所示:

The above returns 136 rows with my test data. The first 15 rows are shown below:

src path        dest
SIN {SIN}       DAC
DAC {DAC}       SIN
SIN {SIN}       DAC
DAC {DAC}       DEL
DEL {DEL}       DAC
DAC {DAC}       CCU
CCU {CCU}       DAC
SIN {SIN}       DEL
DEL {DEL}       SIN
CCU {CCU}       DEL
DEL {DEL}       CCU
DEL {DEL,CCU}   DAC
DAC {DAC,CCU}   DAC
DEL {DEL,CCU}   DEL
DAC {DAC,CCU}   DEL
DEL {DEL,CCU}   DEL
DAC {DAC,CCU}   DEL

这部分,树遍历的设置,是编写递归 cte 中最难的部分.现在,遍历已经建立,我们可以修改上面的内容:

This part, the setting up of the tree traversal, is the hardest part of writing a recursive cte. Now, that the traversal is set up, we can modify the above so that:

  1. 我们从源头开始在目的地结束
  2. 到达目的地时停止迭代
  3. 只考虑转机 A-B &如果到达(A-B) <,则B-C 有效.出发(B-C)

对于这个例子,让 src := SIN &dest := CCU

for this example, let src := SIN & dest := CCU

WITH RECURSIVE flight_paths (src, path, dest, stt, endt) AS (
SELECT
  src
, ARRAY[src]
, dest 
, stt
, endt
FROM flight
UNION ALL
SELECT
  fp.src
, fp.path || f.src
, f.dest 
, fp.stt
, f.endt -- update endt to the new endt
FROM flight f
JOIN flight_paths fp ON f.src = fp.dest 
WHERE NOT f.src = ANY(fp.path) 
  AND NOT 'CCU' = ANY(fp.path) 
  -- (2) this new predicate stop iteration when the destination is reached
  AND f.stt > fp.endt
  -- (3) this new predicate only proceeds the iteration if the connecting flights are valid
) 
SELECT * 
FROM flight_paths
WHERE src = 'SIN' AND dest = 'CCU'
-- (1) specify the start & end

这给出了 2 条可能的航线,其中第一趟航班的起飞时间为列 stt &最后一班航班的到达时间 endt:

This give 2 possible routes with the departure time of the first flight as the column stt & the arrival time of the last flight as endt:

src path            dest    stt                 endt
SIN {SIN,DAC}       CCU     2016-12-31 22:45:00 2017-01-01 11:15:00
SIN {SIN,DAC,DEL}   CCU     2016-12-31 22:45:00 2017-01-01 14:00:00

现在可以很容易地优化结果集.例如,最终查询可以修改为仅返回中午之前到达目的地的航班:

It is now possible to refine the result set quite easily. For instance, the final query could be modified to only return flights that arrive at the destination before noon as:

SELECT * 
FROM flight_paths
WHERE src = 'SIN' AND dest = 'CCU' 
  AND endt::time < '12:00:00'::time

从这一点开始,添加计算飞行时间的列也相当容易.递归 cte 中的连接时间,然后过滤符合这两次谓词的航班.我希望我已经为您提供了足够的细节来生成这两个变体.

It is also fairly easy from this point on to add columns that calculate flight time & connection time in the recursive cte and then filter for flights that fit a predicate on those two times. I hope I've given enough details for you to generate these two variations.

您还可以通过过滤 path 数组上的长度来限制连接数,尽管在达到最大连接数后停止递归 cte 中的迭代可能更有效.

You can also cap the number of connections by filtering on the length on the path array, although it would probably be more efficient to stop the iteration in the recursive cte once the max connections is reached.

最后一点:为了简单起见,我使用了不包括最终目的地的路径,例如{SIN, DAC, DEL} 而不是航班顺序 {SIN-DAC, DAC-DEL, DEL-CCU}(或中途停留 {DAC, DEL}),但我们可以很容易地得到这些表示,如下所示:

One final note: to keep things simple before, I worked with the paths excluding the final destination, e.g. {SIN, DAC, DEL} instead of the sequence of flights {SIN-DAC, DAC-DEL, DEL-CCU} (or stop overs {DAC, DEL}), but we can get those representations fairly easily as shown below:

WITH RECURSIVE flight_paths (src, flights, path, dest, stt, endt) AS (
SELECT
  src
, ARRAY[src || '-' || dest]
, ARRAY[src]
, dest 
, stt
, endt
FROM flight
UNION ALL
SELECT
  fp.src
, fp.flights || (f.src || '-' || f.dest)
, fp.path || f.src
, f.dest
, fp.stt
, f.endt
FROM flight f
JOIN flight_paths fp ON f.src = fp.dest 
WHERE NOT f.src = ANY(fp.path) 
  AND NOT 'CCU' = ANY(fp.path) 
  AND f.stt > fp.endt
) 
SELECT flights, stt, endt, path[2:] stopovers
FROM flight_paths
WHERE src = 'SIN' AND dest = 'CCU'

这篇关于Sql查询以获取连接航班的src到dest路径的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆