使用一个表的值查询BigQuery中的另一张表 [英] Using one table's values to query another table in BigQuery
问题描述
BigQuery中有两个表,一个是带有模式的trip_times表:
I have two tables in BigQuery, one is a journey_times table with a schema:
- journey_id(int)
- vehicle_id(字符串)
- 开始(时间戳yyyy-mm-dd hh:mm:ss)
- 完成(时间戳yyyy-mm-dd hh:mm:ss)
样本数据:
-trip_id¦车辆¦开始¦完成-1¦ car1¦ 2017-12-05 01:33:44 UTC¦ 2017-12-05 01:53:14 UTC-2¦ car2¦ 2017-12-05 03:04:18 UTC¦ 2017-12-05 03:28:49 UTC
,另一个是距离表,其模式为:
and the other is a distance table with a schema of:
- vehicle_id(字符串)
- 时间戳(时间戳yyyy-mm-dd hh:mm:ss)
- 值(浮动)
样本数据:
-车辆¦时间戳¦值-car3¦ 2016-08-30 17:36:52 UTC¦ 0.01635375-car3¦ 2016-08-30 17:36:53 UTC¦ 0.02862375
我想做的是查询距离表,并在trip_times表中找到每个行驶行的最大值和最小值(具有相同的车辆ID,并且在开始和结束时间戳之间),以得到一个表可以加入到旅程表中,如下所示:
What I want to do is query the distance table and find the maximum and minimum value for each of the journey rows in the journey_times table (with the same vehicle ID and between the start and finish timestamps) to end up with a table that can be joined onto the journey table and look like:
- journey_id
- max_distance_value
- 最小距离值
您将如何使用BigQuery标准或旧版SQL编写代码?
我在下面的尝试没有产生任何结果.`
How would you write this in in BigQuery standard or legacy SQL?
My attempt below doesn't produce any results.
`
WITH
distance_table AS (
SELECT
vehicle,
timestamp,
value
FROM
'project.trip_distance' ),
journey_table AS (
SELECT
journey_id,
vehicle,
start,
finish
FROM
'project.journey_times')
SELECT
MIN(distance_table.value)
FROM
distance_table JOIN journey_table
using (vehicle)
WHERE
distance_table.vehicle = journey_table.vehicle
AND distance_table.timestamp BETWEEN journey_table.start
AND journey_table.finish
推荐答案
如果我正确理解,这是 join
和 group by
:
If I understand correctly, this is a join
and group by
:
select j.journey_id, min(value), max(value)
from journey_times jt join
vehicles v
on jt.vehicle_id = v.vehicle_id and
v.timestamp between jt.start and jt.finish
group by j.journey_id;
这篇关于使用一个表的值查询BigQuery中的另一张表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!