如何在BigQuery中旋转数据表 [英] How to Pivot table in BigQuery
问题描述
我正在使用Google Big Query,并且试图从公共样本数据集中得出重要的结果.
I am using Google Big Query, and I am trying to get a pivoted result out from public sample data set.
对现有表的简单查询是:
A simple query to an existing table is:
SELECT *
FROM publicdata:samples.shakespeare
LIMIT 10;
此查询返回以下结果集.
This query returns following result set.
现在我要做的是,以这种方式从表中获取结果:如果该单词很勇敢,请选择勇敢"作为column_1,如果该单词已有人参与,请选择"ATTENDED"作为column_2,然后进行汇总这些2的字数.
Now what I am trying to do is, get the results from the table in such way that if the word is brave, select "BRAVE" as column_1 and if the word is attended, select "ATTENDED" as column_2, and aggregate the word count for these 2.
这是我正在使用的查询.
Here is the query that I am using.
SELECT
(CASE WHEN word = 'brave' THEN 'BRAVE' ELSE '' END) AS column_1,
(CASE WHEN word = 'attended' THEN 'ATTENDED' ELSE '' END) AS column_2,
SUM (word_count)
FROM publicdata:samples.shakespeare
WHERE (word = 'brave' OR word = 'attended')
GROUP BY column_1, column_2
LIMIT 10;
但是,此查询返回数据
我要找的是
我知道此数据集的枢纽没有意义.但我仅以此为例来说明问题.如果您能为我提供一些指导,那就太好了.
I know this pivot for this data set does not make sense. But I am just taking this as an example to explain the problem. It will be great if you can put in some directions for me.
我还提到了如何使用BigQuery?,似乎也有我在这里提到的相同问题.
EDITED: I also referred to How to simulate a pivot table with BigQuery? and it seems it also has the same issue I mentioned here.
推荐答案
更新2020:
只需调用fhoffa.x.pivot()
,如这篇文章中所述:
Just call fhoffa.x.pivot()
, as detailed in this post:
对于2019年的示例,例如:
For the 2019 example, for example:
CREATE OR REPLACE VIEW `fh-bigquery.temp.a` AS (
SELECT * EXCEPT(SensorName), REGEXP_REPLACE(SensorName, r'.*/', '') SensorName
FROM `data-sensing-lab.io_sensor_data.moscone_io13`
);
CALL fhoffa.x.pivot(
'fh-bigquery.temp.a'
, 'fh-bigquery.temp.delete_pivotted' # destination table
, ['MoteName', 'TIMESTAMP_TRUNC(Timestamp, HOUR) AS hour'] # row_ids
, 'SensorName' # pivot_col_name
, 'Data' # pivot_col_value
, 8 # max_columns
, 'AVG' # aggregation
, 'LIMIT 10' # optional_limit
);
更新2019:
由于这是一个很普遍的问题,所以让我更新为#standardSQL和更通用的透视案例.在这种情况下,我们有多行,并且每个传感器都查看不同类型的属性.要对其进行调整,我们将执行以下操作:
Since this is a popular question, let me update to #standardSQL and a more general case of pivoting. In this case we have multiple rows, and each sensor looks at a different type of property. To pivot it, we would do something like:
#standardSQL
SELECT MoteName
, TIMESTAMP_TRUNC(Timestamp, hour) hour
, AVG(IF(SensorName LIKE '%altitude', Data, null)) altitude
, AVG(IF(SensorName LIKE '%light', Data, null)) light
, AVG(IF(SensorName LIKE '%mic', Data, null)) mic
, AVG(IF(SensorName LIKE '%temperature', Data, null)) temperature
FROM `data-sensing-lab.io_sensor_data.moscone_io13`
WHERE MoteName = 'XBee_40670F5F'
GROUP BY 1, 2
作为AVG()
的替代方法,您可以尝试MAX()
,ANY_VALUE()
等.
As an alternative to AVG()
you can try MAX()
, ANY_VALUE()
, etc.
以前:
我不确定您要做什么,但是:
I'm not sure what you are trying to do, but:
SELECT NTH(1, words) WITHIN RECORD column_1, NTH(2, words) WITHIN RECORD column_2, f0_
FROM (
SELECT NEST(word) words, SUM(c)
FROM (
SELECT word, SUM(word_count) c
FROM publicdata:samples.shakespeare
WHERE word in ('brave', 'attended')
GROUP BY 1
)
)
更新:结果相同,查询更简单:
UPDATE: Same results, simpler query:
SELECT NTH(1, word) column_1, NTH(2, word) column_2, SUM(c)
FROM (
SELECT word, SUM(word_count) c
FROM publicdata:samples.shakespeare
WHERE word in ('brave', 'attended')
GROUP BY 1
)
这篇关于如何在BigQuery中旋转数据表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!