Split a value into two columns based on time of day in BigQuery


Question

The energy usage of a device is logged hourly:

+--------------+-----------+-----------------------+
| energy_usage | device_id |  timestamp            |
+--------------+-----------+-----------------------+
| 10           | 1         |  2019-02-12T01:00:00  |
| 16           | 2         |  2019-02-12T01:00:00  |
| 26           | 1         |  2019-03-12T02:00:00  |
| 24           | 2         |  2019-03-12T02:00:00  |
+--------------+-----------+-----------------------+

My goal is to:

  1. Create two columns, one for energy_usage_day (8am-8pm) and another for energy_usage_night (8pm-8am)
  2. Create a monthly aggregate, grouped by device_id, that sums up the energy usage
  3. Remove rows with a monthly energy usage lower than 50

So the result might look like this:

+--------------+------------------+--------------------+-----------+---------+------+
| energy_usage | energy_usage_day | energy_usage_night | device_id |  month  | year |
+--------------+------------------+--------------------+-----------+---------+------+
| 80           | 30               | 50                 | 1         | 2       | 2019 |
| 130          | 60               | 70                 | 2         | 3       | 2019 |
+--------------+------------------+--------------------+-----------+---------+------+

For step 2, I would use:

SUM(energy_usage) OVER (PARTITION BY device_id ORDER BY FORMAT_TIMESTAMP("%m", TIMESTAMP(timestamp))) 

However, I am not sure how to accomplish step 1. Is it even possible in BigQuery?

Recommended answer

Use IF inside the aggregate; there is no need for OVER:

SELECT SUM(energy_usage) energy_usage
  , SUM(IF(EXTRACT(HOUR FROM timestamp) BETWEEN 8 AND 19, energy_usage, 0)) energy_usage_day
  , SUM(IF(EXTRACT(HOUR FROM timestamp) NOT BETWEEN 8 AND 19, energy_usage, 0)) energy_usage_night
  , device_id
  , EXTRACT(MONTH FROM timestamp) month, EXTRACT(YEAR FROM timestamp) year
FROM `data`
GROUP BY device_id, month, year
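
The query above covers steps 1 and 2 but leaves out step 3, dropping months whose total usage is below 50. A minimal sketch of one way to add it is a HAVING clause on the monthly sum. The inline `data` CTE below is only a stand-in built from the four sample rows in the question, and it assumes the timestamp column is of type TIMESTAMP:

```sql
-- Stand-in for the real table, built from the sample rows in the question.
-- Assumption: the timestamp column is a TIMESTAMP, not a STRING.
WITH data AS (
  SELECT 10 AS energy_usage, 1 AS device_id, TIMESTAMP '2019-02-12 01:00:00' AS timestamp
  UNION ALL SELECT 16, 2, TIMESTAMP '2019-02-12 01:00:00'
  UNION ALL SELECT 26, 1, TIMESTAMP '2019-03-12 02:00:00'
  UNION ALL SELECT 24, 2, TIMESTAMP '2019-03-12 02:00:00'
)
SELECT
  SUM(energy_usage) AS energy_usage,
  -- Hours 8..19 cover 08:00 up to, but not including, 20:00.
  SUM(IF(EXTRACT(HOUR FROM timestamp) BETWEEN 8 AND 19, energy_usage, 0)) AS energy_usage_day,
  -- Every other hour (20..23 and 0..7) counts as night.
  SUM(IF(EXTRACT(HOUR FROM timestamp) NOT BETWEEN 8 AND 19, energy_usage, 0)) AS energy_usage_night,
  device_id,
  EXTRACT(MONTH FROM timestamp) AS month,
  EXTRACT(YEAR FROM timestamp) AS year
FROM data
GROUP BY device_id, month, year
-- Step 3: keep only device-months whose total usage is at least 50.
HAVING SUM(energy_usage) >= 50
```

With only these four sample rows, every device-month total is below 50, so the filter removes all of them. Lower the threshold to see the grouped rows. If the column is actually stored as a STRING, as the TIMESTAMP(timestamp) call in the question suggests, wrapping it as TIMESTAMP(timestamp) inside each EXTRACT should work. Note also that EXTRACT on a TIMESTAMP uses UTC unless an AT TIME ZONE clause is given, which matters if "8am" is meant in local time.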
