Big Query - Transpose array/json objects into columns
Problem description
This question is a continuation of these two:
We have a table in Big Query like below.
Input table:
Name | Question | Answer
-----+----------+-------------------------------------------
Bob  | Interest | ["a"]
Sue  | Interest | ["a", "b"]
Joe  | Interest | ["b"]
Joe  | Gender   | Male
Bob  | Gender   | Female
Sue  | DOB      | 2020-10-17
Bob  | Others   | { "country" : "es", "language" : "ca"}
Note: All the values in the Answer column are stringified values and the Arrays / JSON objects are dynamic.
We want to convert the above table to the below format to make it BI/Visualisation friendly.
Desired table:
+------+---+---+---+--------+------------+---------+----------+
| Name | a | b | c | Gender | DOB        | country | language |
+------+---+---+---+--------+------------+---------+----------+
| Bob  | 1 | 0 | 0 | Female | 2020-10-17 | es      | ca       |
| Sue  | 1 | 1 | 0 | -      | -          | -       | -        |
| Joe  | 0 | 1 | 0 | Male   | -          | -       | -        |
+------+---+---+---+--------+------------+---------+----------+
Solution
Below is for BigQuery Standard SQL
#standardSQL
-- Step 1: flatten the raw answers into one (name, question, answer) row per value
create temp table data as
-- Interest: strip brackets, quotes and spaces, then one row per array element
select name, question, value as answer
from `project.dataset.table`,
unnest(split(translate(answer, '[]" ', ''))) value
where question = 'Interest'
union all
-- plain scalar answers are passed through unchanged
select name, question, answer
from `project.dataset.table`
where not question in ('Interest', 'Others')
union all
-- Others: each JSON key becomes the question, each JSON value the answer
select name,
  split(value, ':')[offset(0)] as question,
  split(value, ':')[offset(1)] as answer
from `project.dataset.table`,
unnest(split(translate(answer, '{}" ', ''))) value
where question = 'Others';

-- Step 2: build the pivot query as a string and run it
EXECUTE IMMEDIATE (
SELECT """
SELECT name, """ ||
  STRING_AGG("""MAX(IF(answer = '""" || value || """', 1, 0)) AS """ || value, ', ')
FROM (
  SELECT DISTINCT answer value FROM data
  WHERE question = 'Interest' ORDER BY value
)) || (
SELECT ", " ||
  STRING_AGG("""MAX(IF(question = '""" || value || """', answer, '-')) AS """ || value, ', ')
FROM (
  SELECT DISTINCT question value FROM data
  WHERE question != 'Interest' ORDER BY value
)) || """
FROM data
GROUP BY name
""";
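To make the dynamic part easier to follow, the string that EXECUTE IMMEDIATE ends up running can be reproduced outside BigQuery. The sketch below is only an illustration in Python, not part of the answer: the function name build_pivot_sql and its two parameters are hypothetical, and whitespace in the generated text is normalized compared with the SQL version.

```python
def build_pivot_sql(interest_values, other_questions):
    """Mimic the string concatenation performed inside EXECUTE IMMEDIATE.

    Both arguments are hypothetical stand-ins for the two SELECT DISTINCT
    subqueries over the temp table `data`.
    """
    # One 0/1 flag column per distinct Interest answer.
    flag_cols = ", ".join(
        f"MAX(IF(answer = '{v}', 1, 0)) AS {v}" for v in sorted(set(interest_values))
    )
    # One value column per remaining question, with '-' as the default.
    value_cols = ", ".join(
        f"MAX(IF(question = '{q}', answer, '-')) AS {q}"
        for q in sorted(set(other_questions))
    )
    return f"SELECT name, {flag_cols}, {value_cols} FROM data GROUP BY name"


# Distinct values taken from the sample data in the question.
generated = build_pivot_sql(["a", "b"], ["Gender", "DOB", "country", "language"])
print(generated)
```

Because each distinct value is pasted in as a column alias, this approach presumably only works when the values are valid column names (no spaces or punctuation).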
If applied to the sample data from your question:
with `project.dataset.table` AS (
  select 'Bob' name, 'Interest' question, '["a"]' answer union all
  select 'Sue', 'Interest', '["a", "b"]' union all
  select 'Joe', 'Interest', '["b"]' union all
  select 'Joe', 'Gender', 'Male' union all
  select 'Bob', 'Gender', 'Female' union all
  select 'Sue', 'DOB', '2020-10-17' union all
  select 'Bob', 'Others', '{ "country" : "es", "language" : "ca"}'
)
the output is
Note: the EXECUTE IMMEDIATE part of the above script is exactly the same as in the previous post. The only change is that the original data is first prepared into the temp table data, which is then used in EXECUTE IMMEDIATE.
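The preparation step is the only new piece, so it may help to see what it does to each row. The following is a rough Python emulation of the three UNION ALL branches, run on the sample rows from the question. The names SAMPLE_ROWS and flatten are made up for this illustration, and the row order is arbitrary (as it is for UNION ALL).

```python
# Sample rows from the question as (name, question, answer) tuples.
SAMPLE_ROWS = [
    ("Bob", "Interest", '["a"]'),
    ("Sue", "Interest", '["a", "b"]'),
    ("Joe", "Interest", '["b"]'),
    ("Joe", "Gender", "Male"),
    ("Bob", "Gender", "Female"),
    ("Sue", "DOB", "2020-10-17"),
    ("Bob", "Others", '{ "country" : "es", "language" : "ca"}'),
]


def flatten(rows):
    """Emulate the three branches that populate the temp table `data`."""
    out = []
    for name, question, answer in rows:
        if question == "Interest":
            # Like translate(answer, '[]" ', ''): drop brackets, quotes, spaces.
            cleaned = answer.translate(str.maketrans("", "", '[]" '))
            # Like split(...) with the default ',' delimiter plus unnest.
            out.extend((name, question, value) for value in cleaned.split(","))
        elif question == "Others":
            # Like translate(answer, '{}" ', ''): drop braces, quotes, spaces.
            cleaned = answer.translate(str.maketrans("", "", '{}" '))
            for pair in cleaned.split(","):
                parts = pair.split(":")
                # offset(0) is the key (new question), offset(1) the value.
                out.append((name, parts[0], parts[1]))
        else:
            # Scalar answers pass through unchanged.
            out.append((name, question, answer))
    return out


for row in flatten(SAMPLE_ROWS):
    print(row)
```

Since this is plain character stripping rather than real JSON parsing, values that themselves contain spaces, commas, or colons (a timestamp such as 10:30, for example) would be split or altered. That is fine for the sample data but worth checking against real answers.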