大约在22:00左右的时间间隔分割2行 [英] Split TIME interval around 22:00 time in 2 rows

查看:83
本文介绍了大约在22:00左右的时间间隔分割2行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有这张表

  + ----- + ----- + ------ ------ + ---------- + --- + 
|行| id | start_time | end_time | |
+ ----- + ----- + ------------ + ---------- + --- +
| 1 | foo | 18:00:00 | 22:00:00 | |
| 2 | bar | 19:00:00 | 23:00:00 | |
| 3 | baz | 08:00:00 | 11:00:00 | |
| 4 | qux | 05:30:00 | 07:30:00 | |
+ ----- + ----- + ------------ + ---------- + --- +

可以用这个生成

  WITH TABLE AS(
SELECTfooAS id,TIME(18,0,0)AS start_time,TIME(22,0,0)AS end_time
UNION ALL
SELECTbar,TIME(19,0,0),TIME(23,0,0)
UNION ALL
SELECTbaz,TIME(08,0,0),TIME(11, 0,0)
UNION ALL
SELECTqux,TIME(05,30,0),TIME(07,30,0)


SELECT * FROM TABLE

如果22:00小时在时间间隔内,则该行需要被分割对于给定的输入,输出应该是:

  + ----- + ----- + ------------ + ---------- + --- + 
|行| id | start_time | end_time | |
+ ----- + ----- + ------------ + ---------- + --- +
| 1 | foo | 18:00:00 | 22:00:00 | |
| 2 | bar | 19:00:00 | 22:00:00 | |
| 3 | bar | 22:00:00 | 23:00:00 | |
| 4 | baz | 08:00:00 | 11:00:00 | |
| 5 | qux | 05:30:00 | 06:00:00 | |
| 6 | qux | 06:00:00 | 07:30:00 | |
+ ----- + ----- + ------------ + ---------- + --- +

请注意,在分割行的开始和结束处均使用22:00小时。



另外,如果小时为06:00,则需要复制相同的确切行为



这种行为可以直接使用BigQuery吗?下面是针对BigQuery标准SQL的内容,它提供了非常通用的解决方案,您可以根据需要使用任意数量的断点他们在中断表达式




$ b

  #standardSQL 
WITH`project.dataset.table` AS(
SELECTfooAS id,TIME(18,0,0)as start_time,TIME(22,0 ,0)AS end_time UNION ALL
SELECTbar,TIME(19,0,0),TIME(23,0,0)UNION ALL
SELECTbaz,TIME(08,0, 0),TIME(11,0,0)UNION ALL
SELECTqux,TIME(05,30,0),TIME(07,30,0)UNION ALL
SELECTxxx, TIME(05,45,0),TIME(23 ,30,0)
),打破AS(
SELECT break FROM UNNEST([TIME(6,0,0),TIME(10,0,0),TIME(22,0,0) ])break
),temp AS(
SELECT id,start_time,end_time,
ARRAY_AGG(break ORDER BY break)break
FROM`project.dataset.table` CROSS JOIN break
WHERE break> start_time和break< end_time
GROUP BY id,start_time,end_time

SELECT ID,点start_time,
IFNULL(next_point,point)end_time $ b $ FROM(
SELECT id ,point,
LEAD(点)OVER(PARTITION BY ID ORDER BY点)next_point
FROM temp,UNNEST(ARRAY_CONCAT([start_time],bre​​ak,[end_time]))point


WHERE NOT next_point IS NULL
- ORDER BY id,point



<在上面的例子中,我设置了三个断点 - 06:00,10:00和22:00

和初始间隔为

行号start_time end_time
1 bar 19:00:00 23:00:00
2 baz 08: 00:00 11:00:00
3 foo 18:00:00 22:00:00
4 qux 05:30:00 07:30:00
5 xxx 05:45: 00 23:30:00

结果为:

 行标识start_time end_time 
1 bar 19:00:00 22:00:00
2 bar 22:00:00 23:00:00
3 baz 08:00:00 10:00:00
4 baz 10:00:00 11:00:00
5 qux 05:30:00 06:00:00
6 qux 06:00:00 07:30:00
7 xxx 05:45:00 06:00:00
8 xxx 06:00:00 10:00:00
9 xxx 10:00:00 22:00:00
10 xxx 22:00:00 23:30:00


I have this table

+-----+-----+------------+----------+---+
| Row | id  | start_time | end_time |   |
+-----+-----+------------+----------+---+
|   1 | foo | 18:00:00   | 22:00:00 |   |
|   2 | bar | 19:00:00   | 23:00:00 |   |
|   3 | baz | 08:00:00   | 11:00:00 |   |
|   4 | qux | 05:30:00   | 07:30:00 |   |
+-----+-----+------------+----------+---+

which can be generated with this

WITH TABLE AS (
  SELECT "foo" AS id, TIME(18,0,0) AS start_time, TIME(22,0,0) AS end_time
  UNION ALL
  SELECT "bar", TIME(19,0,0), TIME(23,0,0)
  UNION ALL
  SELECT "baz", TIME(08,0,0), TIME(11,0,0)
  UNION ALL
  SELECT "qux", TIME(05,30,0), TIME(07,30,0)
)

SELECT * FROM TABLE

If the 22:00 hour is inside the time interval, the row need to be splitted around the hour itself.

For the given input, the output should be:

+-----+-----+------------+----------+---+
| Row | id  | start_time | end_time |   |
+-----+-----+------------+----------+---+
|   1 | foo | 18:00:00   | 22:00:00 |   |
|   2 | bar | 19:00:00   | 22:00:00 |   |
|   3 | bar | 22:00:00   | 23:00:00 |   |
|   4 | baz | 08:00:00   | 11:00:00 |   |
|   5 | qux | 05:30:00   | 06:00:00 |   |
|   6 | qux | 06:00:00   | 07:30:00 |   |
+-----+-----+------------+----------+---+

Please note that 22:00 hour is used on both start and end of the splitted row.

Also, the same exact behaviour need to be replicated if the hour is 06:00

Can this behaviour be made using directly BigQuery?

解决方案

Below is for BigQuery Standard SQL and provides quite generic solution where you can have any number of break points as you need by "enlisting" them in breaks expressions

#standardSQL
WITH `project.dataset.table` AS (
  SELECT "foo" AS id, TIME(18,0,0) AS start_time, TIME(22,0,0) AS end_time UNION ALL
  SELECT "bar", TIME(19,0,0), TIME(23,0,0) UNION ALL
  SELECT "baz", TIME(08,0,0), TIME(11,0,0) UNION ALL
  SELECT "qux", TIME(05,30,0), TIME(07,30,0) UNION ALL
  SELECT "xxx", TIME(05,45,0), TIME(23,30,0)
), breaks AS (
  SELECT break FROM UNNEST([TIME(6,0,0), TIME(10,0,0), TIME(22,0,0)]) break
), temp AS (
  SELECT id, start_time, end_time, 
    ARRAY_AGG(break ORDER BY break) break
  FROM `project.dataset.table` CROSS JOIN breaks 
  WHERE break > start_time AND break < end_time 
  GROUP BY id, start_time, end_time
)
SELECT id, point start_time, 
  IFNULL(next_point, point) end_time
FROM (
  SELECT id, point, 
    LEAD(point) OVER(PARTITION BY id ORDER BY point) next_point
  FROM temp, UNNEST(ARRAY_CONCAT([start_time], break, [end_time])) point

)
WHERE NOT next_point IS NULL
-- ORDER BY id, point  

in above example I set three break points - 06:00, 10:00 and 22:00

and initial intervals as

Row id      start_time  end_time     
1   bar     19:00:00    23:00:00     
2   baz     08:00:00    11:00:00     
3   foo     18:00:00    22:00:00     
4   qux     05:30:00    07:30:00     
5   xxx     05:45:00    23:30:00    

and result is:

Row id      start_time  end_time     
1   bar     19:00:00    22:00:00     
2   bar     22:00:00    23:00:00     
3   baz     08:00:00    10:00:00     
4   baz     10:00:00    11:00:00     
5   qux     05:30:00    06:00:00     
6   qux     06:00:00    07:30:00     
7   xxx     05:45:00    06:00:00     
8   xxx     06:00:00    10:00:00     
9   xxx     10:00:00    22:00:00     
10  xxx     22:00:00    23:30:00     

这篇关于大约在22:00左右的时间间隔分割2行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆