Transpose rows into columns in BigQuery (Pivot implementation)
Problem description
I want to use BigQuery to generate a new table from a set of key-value pairs, with each key becoming a column name and each value becoming the value in that column.
Example:
**Key**          **Value**
channel_title    Mahendra Guru
youtube_id       ugEGMG4-MdA
channel_id       UCiDKcjKocimAO1tV
examId           72975611-4a5e-11e5
postId           1189e340-b08f
channel_title    Ab Live
youtube_id       3TNbtTwLY0U
channel_id       UCODeKM_D6JLf8jJt
examId           72975611-4a5e-11e5
postId           0c3e6590-afeb
I want to convert it to:
**channel_title  youtube_id   channel_id         examId              postId**
Mahendra Guru    ugEGMG4-MdA  UCiDKcjKocimAO1tV  72975611-4a5e-11e5  1189e340-b08f
Ab Live          3TNbtTwLY0U  UCODeKM_D6JLf8jJt  72975611-4a5e-11e5  0c3e6590-afeb
How can I do this using BigQuery?
Solution
At the time this answer was written, BigQuery did not support pivoting functions.
You can still do this in BigQuery using the approach below.
But first, in addition to the two columns in the input data, you must have one more column that specifies which groups of input rows need to be combined into one output row.
So, I assume your input table (yourTable) looks like this:
**id**  **Key**          **Value**
1       channel_title    Mahendra Guru
1       youtube_id       ugEGMG4-MdA
1       channel_id       UCiDKcjKocimAO1tV
1       examId           72975611-4a5e-11e5
1       postId           1189e340-b08f
2       channel_title    Ab Live
2       youtube_id       3TNbtTwLY0U
2       channel_id       UCODeKM_D6JLf8jJt
2       examId           72975611-4a5e-11e5
2       postId           0c3e6590-afeb
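The question's data has no such id column, so it has to come from somewhere. As a rough sketch only: if the pairs are read from an ordered source such as a file, and every group is assumed to start with the same key, ids could be assigned before loading the data. The function name `assign_ids` and the choice of `channel_title` as the group-starting key are my assumptions, not part of the original data. This does not work on rows already stored in a BigQuery table, because a table has no inherent row order.

```python
def assign_ids(pairs, first_key="channel_title"):
    """Yield (id, key, value), starting a new id whenever first_key appears.

    Assumption: pairs arrive in their original order and every group
    begins with first_key. Pairs seen before the first first_key get id 0.
    """
    current_id = 0
    for key, value in pairs:
        if key == first_key:
            current_id += 1
        yield current_id, key, value


# Shortened sample taken from the question's data.
pairs = [
    ("channel_title", "Mahendra Guru"),
    ("youtube_id", "ugEGMG4-MdA"),
    ("channel_title", "Ab Live"),
    ("youtube_id", "3TNbtTwLY0U"),
]
rows = list(assign_ids(pairs))
for row in rows:
    print(row)
```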
First, run the query below (Step 1). It is written in BigQuery legacy SQL:
SELECT 'SELECT id, ' +
   GROUP_CONCAT_UNQUOTED(
      'MAX(IF(key = "' + key + '", value, NULL)) as [' + key + ']'
   )
   + ' FROM yourTable GROUP BY id ORDER BY id'
FROM (
  SELECT key
  FROM yourTable
  GROUP BY key
  ORDER BY key
)
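If you already know the distinct keys, the same string can be assembled on the client side instead. Here is a minimal Python sketch of that idea. The function name `build_pivot_query` and the hard-coded key list are hypothetical, and the sketch assumes keys contain no double quotes or square brackets.

```python
def build_pivot_query(keys, table="yourTable"):
    """Return the legacy-SQL pivot query string that Step 1 generates.

    Assumption: keys are plain identifiers with no quotes or brackets,
    since they are pasted into the SQL text without escaping.
    """
    # One MAX(IF(...)) column per distinct key, sorted like ORDER BY key.
    columns = ",".join(
        'MAX(IF(key = "{0}", value, NULL)) as [{0}]'.format(k)
        for k in sorted(set(keys))
    )
    return "SELECT id, " + columns + " FROM " + table + " GROUP BY id ORDER BY id"


keys = ["channel_title", "youtube_id", "channel_id", "examId", "postId"]
query = build_pivot_query(keys)
print(query)
```

This only builds the query text; running it against BigQuery is still a separate step.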
The result of the above query is a string that, once formatted, looks like this:
SELECT id,
  MAX(IF(key = "channel_id", value, NULL)) AS [channel_id],
  MAX(IF(key = "channel_title", value, NULL)) AS [channel_title],
  MAX(IF(key = "examId", value, NULL)) AS [examId],
  MAX(IF(key = "postId", value, NULL)) AS [postId],
  MAX(IF(key = "youtube_id", value, NULL)) AS [youtube_id]
FROM yourTable
GROUP BY id
ORDER BY id
Now copy the result above and run it as a normal query (Step 2). Note: you don't need to format it; I did that for presentation only.
The result will be what you expect:
id  channel_id         channel_title  examId              postId         youtube_id
1   UCiDKcjKocimAO1tV  Mahendra Guru  72975611-4a5e-11e5  1189e340-b08f  ugEGMG4-MdA
2   UCODeKM_D6JLf8jJt  Ab Live        72975611-4a5e-11e5  0c3e6590-afeb  3TNbtTwLY0U
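To see why the generated query produces this shape, it may help to mimic it outside BigQuery. The sketch below is a plain-Python illustration of the same conditional aggregation, not how BigQuery executes it: each MAX(IF(key = ..., value, NULL)) keeps, per id, the largest non-null value for one key. The function name `pivot` and the in-memory `rows` list are assumptions made for the example.

```python
rows = [
    (1, "channel_title", "Mahendra Guru"),
    (1, "youtube_id", "ugEGMG4-MdA"),
    (1, "channel_id", "UCiDKcjKocimAO1tV"),
    (1, "examId", "72975611-4a5e-11e5"),
    (1, "postId", "1189e340-b08f"),
    (2, "channel_title", "Ab Live"),
    (2, "youtube_id", "3TNbtTwLY0U"),
    (2, "channel_id", "UCODeKM_D6JLf8jJt"),
    (2, "examId", "72975611-4a5e-11e5"),
    (2, "postId", "0c3e6590-afeb"),
]


def pivot(rows):
    """Group (id, key, value) rows by id and spread the keys into columns."""
    keys = sorted({key for _, key, _ in rows})
    table = {}
    for row_id, key, value in rows:
        # Every id starts with all columns set to None (SQL NULL).
        record = table.setdefault(row_id, dict.fromkeys(keys))
        # MAX ignores NULLs, so keep the largest non-null value per cell.
        if record[key] is None or value > record[key]:
            record[key] = value
    return keys, [(row_id, table[row_id]) for row_id in sorted(table)]


keys, result = pivot(rows)
print("id", keys)
for row_id, record in result:
    print(row_id, [record[k] for k in keys])
```

If a key appears more than once for the same id, only the largest value survives, just as with MAX in the query.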
Please note: you can skip Step 1 if you can write the Step 2 query yourself, which is practical when the number of fields is small and constant, or when it is a one-time job. Step 1 is just a helper that generates the query for you, so you can recreate it quickly at any time.
If you are interested, you can read more about pivoting in my other posts:
How to scale Pivoting in BigQuery?
Please note: there is a limit of 10K columns per table, so the number of pivoted columns (organizations, in that post) is limited to 10K.
You can also look at the posts below for simplified examples, if the one above is too complex or verbose:
How to transpose rows to columns with large amount of the data in BigQuery/SQL?
How to create dummy variable columns for thousands of categories in Google BigQuery?
Pivot Repeated fields in BigQuery