Dynamic Column Names in a BigQuery SQL Query

Question

I have a BigQuery table in which every row is a visit by a user in a country. The schema looks something like this:

UserID   |   Place   |   StartDate   |   EndDate   | etc ...
---------------------------------------------------------------
134      |  Paris    |   234687432   |   23648949  | etc ...
153      |  Bangkok  |   289374897   |   2348709   | etc ...
134      |  Paris    |   9287324892  |   3435438   | etc ...

The "Place" column has at most a few dozen distinct values, but I don't know them all in advance.

I want to query this table so that the result has one column per distinct value of the Place column, and each cell holds the total number of visits by that user to that place. The end result should look like this:

UserID | Paris | Bangkok | Rome | London | Rivendell | Alderaan 
----------------------------------------------------------------
134    |  2    |  0      |  0   |  0     |  0        |  0 
153    |  0    |  1      |  0   |  0     |  0        |  0
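
To pin down the reshaping being asked for, here is a minimal sketch in plain Python over the three sample rows. It is only an illustration of the intended semantics, not a BigQuery solution; the names `visits` and `pivot_visits` are made up for this example.

```python
from collections import Counter

# Sample (UserID, Place) pairs taken from the table in the question.
visits = [(134, "Paris"), (153, "Bangkok"), (134, "Paris")]


def pivot_visits(rows):
    """Return (places, table) where table[user][place] is the visit count."""
    places = sorted({place for _, place in rows})
    users = sorted({user for user, _ in rows})
    counts = Counter(rows)  # missing (user, place) pairs count as 0
    table = {
        user: {place: counts[(user, place)] for place in places}
        for user in users
    }
    return places, table


places, table = pivot_visits(visits)
```

The column set (`places`) is only known after scanning the data, which is exactly why the SQL version needs to be generated dynamically.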

I guess I can get all the possible values of "Place" with SELECT DISTINCT, but how can I produce a result table with this structure?

Thanks.

Recommended answer

The following is for BigQuery Standard SQL.

Step 1 - dynamically assemble the SQL statement from all distinct values of the "Place" field

#standardSQL
SELECT '''
SELECT UserID,''' || STRING_AGG(DISTINCT
  ' COUNTIF(Place = "' || Place || '") AS ' || REPLACE(Place, ' ', '_')
) || ''' FROM `project.dataset.table`
GROUP BY UserID
'''
FROM `project.dataset.table`
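
If the distinct place values have already been fetched on the client side, the same string assembly can be sketched in Python. This mirrors the query above rather than replacing it; `build_pivot_sql` is a hypothetical helper name, and the sketch assumes place values contain no double quotes.

```python
def build_pivot_sql(places, table):
    """Build the pivot query text: one COUNTIF column per distinct place."""
    columns = ", ".join(
        # Same rule as the SQL above: spaces in the alias become underscores.
        f'COUNTIF(Place = "{place}") AS {place.replace(" ", "_")}'
        for place in sorted(set(places))
    )
    return f"SELECT UserID, {columns}\nFROM `{table}`\nGROUP BY UserID"


sql = build_pivot_sql(["Paris", "Los Angeles", "Paris"], "project.dataset.table")
```

Sorting the places is a choice made here so the column order is stable between runs; the SQL version does not guarantee any particular order.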

Note: the output is a single row containing text like the following (split across multiple lines here for readability):

SELECT UserID, 
COUNTIF(Place = "Paris") AS Paris, 
COUNTIF(Place = "Los Angeles") AS Los_Angeles 
FROM `project.dataset.table` 
GROUP BY UserID

Note: I replaced Bangkok with Los Angeles to show why spaces in the values must be replaced with underscores: an unquoted column alias cannot contain a space.
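
Spaces are probably not the only troublesome characters: hyphens, apostrophes, or a leading digit would also produce an invalid unquoted alias. A slightly more defensive sanitizer might look like the sketch below. It assumes aliases should contain only letters, digits, and underscores and should not start with a digit; `to_column_alias` is a hypothetical helper, not part of the original answer.

```python
import re


def to_column_alias(place):
    """Turn an arbitrary place name into a conservative column alias."""
    # Replace every character outside [A-Za-z0-9_] with an underscore.
    alias = re.sub(r"[^A-Za-z0-9_]", "_", place)
    # Prefix an underscore if the alias is empty or starts with a digit.
    if not alias or alias[0].isdigit():
        alias = "_" + alias
    return alias
```

Note that two different places can collapse to the same alias under this rule (for example "A B" and "A-B"), so a real pipeline would need to detect and resolve such collisions.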

Step 2 - copy the output text of Step 1 and run it as a query

You can automate these two steps with any client of your choice.
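
As one possible shape for that automation, the sketch below runs the Step 1 query, reads the generated text from the single result row, and then runs that text as the Step 2 query. The function `run_two_step_pivot` and its `run_query` parameter are assumptions made for illustration: `run_query` is any callable that takes SQL text and returns a sequence of rows indexable by position.

```python
# Step 1 query from the answer, with the table name left as a placeholder.
STEP1_TEMPLATE = """
SELECT '''
SELECT UserID,''' || STRING_AGG(DISTINCT
  ' COUNTIF(Place = "' || Place || '") AS ' || REPLACE(Place, ' ', '_')
) || ''' FROM `{table}`
GROUP BY UserID
'''
FROM `{table}`
"""


def run_two_step_pivot(run_query, table):
    """Run Step 1 to generate the pivot SQL, then run that SQL (Step 2)."""
    step1_rows = list(run_query(STEP1_TEMPLATE.format(table=table)))
    generated_sql = step1_rows[0][0]  # Step 1 returns one row, one column
    return generated_sql, list(run_query(generated_sql))
```

With the google-cloud-bigquery package, something like `lambda sql: list(client.query(sql).result())` could plausibly serve as `run_query`, given an authenticated `client`; that wiring is not exercised here.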
