Query key value in different columns from Google BigQuery
Problem description
I collect analytics with Firebase Analytics, which I have linked to Google BigQuery.
I have the following data in BigQuery (unnecessary columns/rows are left out; the dataset looks similar to https://bigquery.cloud.google.com/table/firebase-analytics-sample-data:ios_dataset.app_events_20160607?tab=preview):
| event_dim.name | event_dim.params.key | event_dim.params.value.string_value |
|----------------|----------------------|-------------------------------------|
| read_post | post_id | p_100 |
| | group_id | g_1 |
| | user_id | u_1 |
| open_group | post_id | p_200 |
| | group_id | g_2 |
| | user_id | u_1 |
| open_group | post_id | p_300 |
| | group_id | g_1 |
| | user_id | u_3 |
I want to query the following data:
- event name
- user id
- group id
I tried the following query:
SELECT
event_dim.name,
FIRST(IF(event_dim.params.key = "user_id", event_dim.params.value.string_value, NULL)) WITHIN RECORD user_id,
FIRST(IF(event_dim.params.key = "group_id", event_dim.params.value.string_value, NULL)) WITHIN RECORD group_id
FROM
[xxx:xxx_IOS.app_events_20161102]
LIMIT
1000
The problem with the above query is that the aggregate function FIRST will give the wrong result, because SELECT statements with a WITHIN modifier return a list of results. The FIRST function will only give the correct result in the case of the first row.
Using standard SQL (uncheck "Use Legacy SQL" under "Show Options") you can do:
SELECT
event_dim.name,
(SELECT value.string_value FROM UNNEST(params)
WHERE key = 'user_id') AS user_id,
(SELECT value.string_value FROM UNNEST(params)
WHERE key = 'group_id') AS group_id
FROM `firebase-analytics-sample-data.ios_dataset.app_events_20160607`,
UNNEST(event_dim) AS event_dim
LIMIT 1000;
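To try the same pattern without access to a Firebase export, here is a minimal sketch in standard SQL that runs against inline data. The `sample` CTE and the `event` alias are hypothetical names introduced for illustration. The literal rows only mirror the first two events of the table in the question, and the nested shape (`name`, `params`, `key`, `value.string_value`) is assumed to match the export schema.

```sql
-- Hypothetical inline data mirroring the table in the question;
-- a real Firebase export row carries many more fields.
WITH sample AS (
  SELECT [
    STRUCT(
      'read_post' AS name,
      [
        STRUCT('post_id' AS key, STRUCT('p_100' AS string_value) AS value),
        STRUCT('group_id' AS key, STRUCT('g_1' AS string_value) AS value),
        STRUCT('user_id' AS key, STRUCT('u_1' AS string_value) AS value)
      ] AS params
    ),
    STRUCT(
      'open_group' AS name,
      [
        STRUCT('post_id' AS key, STRUCT('p_200' AS string_value) AS value),
        STRUCT('group_id' AS key, STRUCT('g_2' AS string_value) AS value),
        STRUCT('user_id' AS key, STRUCT('u_1' AS string_value) AS value)
      ] AS params
    )
  ] AS event_dim
)
SELECT
  event.name,
  -- Each scalar subquery scans only this event's params array
  -- and picks the value whose key matches.
  (SELECT value.string_value FROM UNNEST(event.params)
   WHERE key = 'user_id') AS user_id,
  (SELECT value.string_value FROM UNNEST(event.params)
   WHERE key = 'group_id') AS group_id
FROM sample,
  UNNEST(event_dim) AS event;
```

Each event should come back as a single row with its own user_id and group_id, regardless of where those keys sit in the params array. Note that a scalar subquery must yield at most one row, so this approach assumes each key appears no more than once per event.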
If you only want rows that have both 'user_id' and 'group_id', you can filter out the NULL values:
SELECT * FROM (
SELECT
event_dim.name,
(SELECT value.string_value FROM UNNEST(params)
WHERE key = 'user_id') AS user_id,
(SELECT value.string_value FROM UNNEST(params)
WHERE key = 'group_id') AS group_id
FROM `firebase-analytics-sample-data.ios_dataset.app_events_20160607`,
UNNEST(event_dim) AS event_dim
)
WHERE user_id IS NOT NULL AND group_id IS NOT NULL
LIMIT 1000;