使用一个表的值查询BigQuery中的另一张表 [英] Using one table's values to query another table in BigQuery

查看:34
本文介绍了使用一个表的值查询BigQuery中的另一张表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

BigQuery中有两个表,一个是带有模式的trip_times表:

I have two tables in BigQuery, one is a journey_times table with a schema:

  1. journey_id(int)
  2. vehicle_id(字符串)
  3. 开始(时间戳yyyy-mm-dd hh:mm:ss)
  4. 完成(时间戳yyyy-mm-dd hh:mm:ss)

样本数据:

-trip_id¦车辆¦开始¦完成-1¦ car1¦ 2017-12-05 01:33:44 UTC¦ 2017-12-05 01:53:14 UTC-2¦ car2¦ 2017-12-05 03:04:18 UTC¦ 2017-12-05 03:28:49 UTC

,另一个是距离表,其模式为:

and the other is a distance table with a schema of:

  1. vehicle_id(字符串)
  2. 时间戳(时间戳yyyy-mm-dd hh:mm:ss)
  3. 值(浮动)

样本数据:

-车辆¦时间戳¦值-car3¦ 2016-08-30 17:36:52 UTC¦ 0.01635375-car3¦ 2016-08-30 17:36:53 UTC¦ 0.02862375

我想做的是查询距离表,并在trip_times表中找到每个行驶行的最大值和最小值(具有相同的车辆ID,并且在开始和结束时间戳之间),以得到一个表可以加入到旅程表中,如下所示:

What I want to do is query the distance table and find the maximum and minimum value for each of the journey rows in the journey_times table (with the same vehicle ID and between the start and finish timestamps) to end up with a table that can be joined onto the journey table and look like:

  1. journey_id
  2. max_distance_value
  3. 最小距离值

您将如何使用BigQuery标准或旧版SQL编写代码?
我在下面的尝试没有产生任何结果.`

How would you write this in in BigQuery standard or legacy SQL?
My attempt below doesn't produce any results. `

WITH
  distance_table AS (
  SELECT
    vehicle,
    timestamp,
    value
  FROM
    'project.trip_distance' ),
  journey_table AS (
  SELECT
    journey_id,
    vehicle,
    start,
    finish        
  FROM
    'project.journey_times')
SELECT
  MIN(distance_table.value)
FROM
  distance_table JOIN journey_table
  using (vehicle)
WHERE
  distance_table.vehicle = journey_table.vehicle
  AND distance_table.timestamp BETWEEN journey_table.start
  AND journey_table.finish

推荐答案

如果我正确理解,这是 join group by :

If I understand correctly, this is a join and group by:

select j.journey_id, min(value), max(value)
from journey_times jt join
     vehicles v
     on jt.vehicle_id = v.vehicle_id and
        v.timestamp between jt.start and jt.finish
group by j.journey_id;

这篇关于使用一个表的值查询BigQuery中的另一张表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆