BigQuery连接了三个表 [英] BigQuery join of three tables

查看：91 发布时间：2018/5/7 17:31:25 sql google-bigquery

本文介绍了BigQuery连接了三个表的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我试图在BigQuery中加入三张表格;表1具有一个事件的记录（即每行是一条记录），表2具有第二个事件的记录，表3具有类别名称

我想根据类别和设备平台生成一个表格1和表格2的最终表格。但是，每次运行时，都会收到一个错误消息，指出 joined.t3.category不是联接中的任何表的字段。

以下是我当前的代码：
Select count（distinct joined.t1.Id）as t1_events，count（distinct t2.Id） as t2_events，joined.t1.Origin as platform，joined.t3.category as category $ b from （ SELECT Id ， Origin， CatId FROM [testing.table_1] as t1 JOIN（SELECT类别，类别ID FROM [testing.table_3]）as t3 on t1.CatId = t3.CategoryID ）AS加入 JOIN（ SELECT b， CategoryId FROM [testing.table_2]）as t2 ON（joined.t1.CatId = t2.CategoryId）按平台，类别分组;
作为参考，下面是表1和表2之间的简单连接，它们完美地工作：
Select count（distinct t1.Id）as t1_event，count（distinct t2.Id）as t2_events，t1.Origin as platform $来自testing.table_1的b $ b作为t1 在t1.CatId = t2.CategoryId 上JOIN testing.table_2作为t2 按平台分组;

解决方案
简单的解决方法是添加 category 字段在第一个内部 SELECT 中 - 否则对最外层 SELECT 不可见 - - 因此错误！这是问题！

另外，在BigQuery Legacy SQL中，您可以使用 EXACT_COUNT_DISTINCT ，否则您会得到统计学近似值 - 请参阅 COUNT（[DISTINCT ]）

因此，对于Legacy SQL，您的查询可能如下所示：

SELECT EXACT_COUNT_DISTINCT（joined.t1.Id）AS t1_events， EXACT_COUNT_DISTINCT（t2.Id）AS t2_events， joined.t1.Origin AS平台 joined.t3.category AS类别 FROM（ SELECT Id，Origin，CatId，类别 FROM [testing.table_1] AS t1 JOIN（SELECT类别，CategoryID FROM [testing.table_3]）AS t3 ON t1.CatId = t3.CategoryID ）AS加入 JOIN（SELECT Id，CategoryId FROM [testing.table_2]）AS t2 ON joined.t1.CatId = t2.CategoryId GROUP BY平台类别

此外，我觉得你可以进一步简化它（假设没有不明确的字段）

SELECT EXACT_COUNT_DISTINCT（joined.t1.Id）AS t1_events， EXACT_COUNT_DISTINCT（t2.Id）AS t2_events， joined.t1.Origin AS平台， joined.t3.category AS类别 FROM（ SELECT Id，Origin，CatId，类别 FROM [testing.table_1] AS t1 JOIN [testing.table_3] AS t3 ON t1.CatId = t3.CategoryID ）AS加入 JOIN [testing.table_2] AS t2 ON joined.t1.CatId = t2.CategoryId GROUP BY平台，类别
当然，您需要做同样的事情如果您使用标准SQL版本（如Elliott所建议的那样：

SELECT COUNT（DISTINCT joined。 t1.Id）AS t1_events， COUNT（DISTINCT t2.Id）AS t2_events， joined.t1.Origin AS平台， joined.t3.category AS类别 FROM（ SELEC T Id，Origin，CatId，类别 FROM`testing.table_1`作为t1 JOIN`testing.table_3`作为t3 ON t1.CatId = t3.CategoryID ）AS加入 JOIN`testing.table_2`作为t2 ON joined.t1.CatId = t2.CategoryId GROUP BY平台，类别

I am trying to join three tables in BigQuery; table 1 has records of one event (i.e. each row is one record), table 2 has records of a second event, and table 3 has category names.

I want to produce a final table that has counts for table 1 and table 2 by category and device platform. However, every time I run this I get an error that says joined.t3.category is not a field of either table in the join.

Here's my current code:
Select count(distinct joined.t1.Id) as t1_events, count(distinct t2.Id) as t2_events, joined.t1.Origin as platform, joined.t3.category as category from ( SELECT Id, Origin, CatId FROM [testing.table_1] as t1 JOIN (SELECT category, CategoryID FROM [testing.table_3]) as t3 on t1.CatId = t3.CategoryID ) AS joined JOIN (SELECT Id, CategoryId FROM [testing.table_2]) as t2 ON (joined.t1.CatId = t2.CategoryId) Group by platform,category;
For reference, here's a simpler join between tables 1 and 2 that works perfectly:
Select count(distinct t1.Id) as t1_event, count(distinct t2.Id) as t2_events, t1.Origin as platform from testing.table_1 as t1 JOIN testing.table_2 as t2 on t1.CatId = t2.CategoryId Group by platform;

解决方案
The simple fix is to add category field in first inner SELECT - otherwise it is not visible to outermost SELECT - thus the error! That was the issue!

Also, in BigQuery Legacy SQL you can use EXACT_COUNT_DISTINCT otherwise you get statistical approximation - see more in COUNT([DISTINCT])

So, for Legacy SQL your query can look like:
SELECT EXACT_COUNT_DISTINCT(joined.t1.Id) AS t1_events, EXACT_COUNT_DISTINCT(t2.Id) AS t2_events, joined.t1.Origin AS platform, joined.t3.category AS category FROM ( SELECT Id, Origin, CatId, category FROM [testing.table_1] AS t1 JOIN (SELECT category, CategoryID FROM [testing.table_3]) AS t3 ON t1.CatId = t3.CategoryID ) AS joined JOIN (SELECT Id, CategoryId FROM [testing.table_2]) AS t2 ON joined.t1.CatId = t2.CategoryId GROUP BY platform, category
Moreover, I feel like you can simplify it further (assuming there will be no ambiguous fields)
SELECT EXACT_COUNT_DISTINCT(joined.t1.Id) AS t1_events, EXACT_COUNT_DISTINCT(t2.Id) AS t2_events, joined.t1.Origin AS platform, joined.t3.category AS category FROM ( SELECT Id, Origin, CatId, category FROM [testing.table_1] AS t1 JOIN [testing.table_3] AS t3 ON t1.CatId = t3.CategoryID ) AS joined JOIN [testing.table_2] AS t2 ON joined.t1.CatId = t2.CategoryId GROUP BY platform, category
Of course you will need to do same fix if you will use Standard SQL version of it (as Elliott has suggested:
SELECT COUNT(DISTINCT joined.t1.Id) AS t1_events, COUNT(DISTINCT t2.Id) AS t2_events, joined.t1.Origin AS platform, joined.t3.category AS category FROM ( SELECT Id, Origin, CatId, category FROM `testing.table_1` AS t1 JOIN `testing.table_3` AS t3 ON t1.CatId = t3.CategoryID ) AS joined JOIN `testing.table_2` AS t2 ON joined.t1.CatId = t2.CategoryId GROUP BY platform, category

这篇关于BigQuery连接了三个表的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

BigQuery连接了三个表 [英] BigQuery join of three tables

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

BigQuery连接了三个表 [英] BigQuery join of three tables

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭