Big Query - Transpose array/json objects into columns

Problem description

    This question is a continuation of these two:

    1. Big Query - Transpose arrays into colums
    2. Big Query - Transpose Specific fields into Columns

    We have a table in Big Query like below.

    Input table:

     Name | Question  | Answer
     -----+-----------+-------
     Bob  | Interest  | ["a"]     
     Sue  | Interest  | ["a", "b"]
     Joe  | Interest  | ["b"]
     Joe  | Gender    | Male
     Bob  | Gender    | Female
     Sue  | DOB       | 2020-10-17
     Bob  | Others    | { "country" : "es", "language" : "ca"}
    

    Note: All values in the Answer column are stored as strings, and the arrays / JSON objects are dynamic (their elements and keys are not fixed in advance).

    We want to convert the above table to the below format to make it BI/Visualisation friendly.

    Desired table:

     +-------------------------------------------------------------+
     | Name | a | b | c | Gender | DOB        | country | language |
     +-------------------------------------------------------------+
     | Bob  | 1 | 0 | 0 | Female |     -      |   es    |   ca     |
     | Sue  | 1 | 1 | 0 |   -    | 2020-10-17 |   -     |   -      |
     | Joe  | 0 | 1 | 0 |  Male  |     -      |   -     |   -      |
     +-------------------------------------------------------------+
    

    Solution

    Below is for BigQuery Standard SQL

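    #standardSQL
    -- Step 1: flatten every answer into one (name, question, answer) row in a temp table
    create temp table data as
    -- Interest: strip [, ], quotes and spaces, then emit one row per array element
    select name, question, value as answer 
    from `project.dataset.table`, 
    unnest(split(translate(answer, '[]" ', ''))) value
    where question = 'Interest'
    union all
    -- scalar questions (Gender, DOB, ...): keep the answer as-is
    select name, question, answer 
    from `project.dataset.table`
    where question not in ('Interest', 'Others')
    union all
    -- Others: strip {, }, quotes and spaces, then turn each key:value pair into a question/answer row
    select name, 
      split(value, ':')[offset(0)] as question, 
      split(value, ':')[offset(1)] as answer 
    from `project.dataset.table`, 
    unnest(split(translate(answer, '{}" ', ''))) value
    where question = 'Others';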
    
    -- Step 2: build the pivot query text from the distinct values in data, then run it
    EXECUTE IMMEDIATE (
      -- one 0/1 flag column per distinct Interest value
      SELECT """
        SELECT name, """ || STRING_AGG("""MAX(IF(answer = '""" || value || """', 1, 0)) AS """ || value, ', ')
      FROM (
        SELECT DISTINCT answer value FROM data
        WHERE question = 'Interest' ORDER BY value
      )
    ) || (
      -- one text column per remaining question, '-' when a name has no answer for it
      SELECT ", " || STRING_AGG("""MAX(IF(question = '""" || value || """', answer, '-')) AS """ || value, ', ')
      FROM (
        SELECT DISTINCT question value FROM data
        WHERE question != 'Interest' ORDER BY value
      )
    ) || """
      FROM data
      GROUP BY name
      """;
    
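To make the first statement easier to follow, here is a minimal Python sketch of the same flattening logic. It is not part of the original answer: the `rows` list simply copies the sample table from the question, and `strip_chars` and `normalize` are hypothetical helper names chosen for illustration. It assumes, like the SQL, that values contain no commas or colons of their own.

```python
# Sketch only: mirrors the three UNION ALL branches of the temp-table step in plain Python.
rows = [
    ("Bob", "Interest", '["a"]'),
    ("Sue", "Interest", '["a", "b"]'),
    ("Joe", "Interest", '["b"]'),
    ("Joe", "Gender", "Male"),
    ("Bob", "Gender", "Female"),
    ("Sue", "DOB", "2020-10-17"),
    ("Bob", "Others", '{ "country" : "es", "language" : "ca"}'),
]


def strip_chars(text, chars):
    """Delete every character listed in `chars`, like TRANSLATE(text, chars, '')."""
    return text.translate({ord(c): None for c in chars})


def normalize(rows):
    """Return one (name, question, answer) tuple per atomic answer."""
    out = []
    for name, question, answer in rows:
        if question == "Interest":
            # '["a", "b"]' -> 'a,b' -> one row per element
            for value in strip_chars(answer, '[]" ').split(","):
                out.append((name, question, value))
        elif question == "Others":
            # '{ "country" : "es" }' -> 'country:es' -> key becomes the question
            for pair in strip_chars(answer, '{}" ').split(","):
                parts = pair.split(":")
                out.append((name, parts[0], parts[1]))
        else:
            # scalar answers are kept unchanged
            out.append((name, question, answer))
    return out


data = normalize(rows)
for row in data:
    print(row)
```

After this step every answer is a single value, which is what allows the pivot to work with simple equality checks.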

    When applied to the sample data from your question, for example:

    with `project.dataset.table` AS (
      select 'Bob' name, 'Interest' question, '["a"]' answer union all
      select 'Sue', 'Interest', '["a", "b"]' union all
      select 'Joe', 'Interest', '["b"]' union all
      select 'Joe', 'Gender', 'Male' union all
      select 'Bob', 'Gender', 'Female' union all
      select 'Sue', 'DOB', '2020-10-17' union all
      select 'Bob', 'Others', '{ "country" : "es", "language" : "ca"}' 
    )    
    

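    the output is (row order may vary):

     name | a | b | DOB        | Gender | country | language
     -----+---+---+------------+--------+---------+---------
     Bob  | 1 | 0 | -          | Female | es      | ca
     Sue  | 1 | 1 | 2020-10-17 | -      | -       | -
     Joe  | 0 | 1 | -          | Male   | -       | -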
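To sanity-check the expected result without running BigQuery, the two `MAX(IF(...))` aggregations can be mimicked in plain Python. This is only a sketch under stated assumptions: `data` is the already flattened sample data written out by hand, and `pivot` is a hypothetical helper, not something from the original answer.

```python
# Sketch only: imitates the GROUP BY name / MAX(IF(...)) pivot on the flattened sample rows.
data = [
    ("Bob", "Interest", "a"),
    ("Sue", "Interest", "a"),
    ("Sue", "Interest", "b"),
    ("Joe", "Interest", "b"),
    ("Joe", "Gender", "Male"),
    ("Bob", "Gender", "Female"),
    ("Sue", "DOB", "2020-10-17"),
    ("Bob", "country", "es"),
    ("Bob", "language", "ca"),
]


def pivot(data):
    """Return {name: {column: value}} with 0/1 interest flags and '-' defaults."""
    interests = sorted({a for _, q, a in data if q == "Interest"})
    questions = sorted({q for _, q, _ in data if q != "Interest"})
    table = {}
    for name, question, answer in data:
        defaults = {**dict.fromkeys(interests, 0), **dict.fromkeys(questions, "-")}
        row = table.setdefault(name, defaults)
        if answer in interests:
            # corresponds to MAX(IF(answer = v, 1, 0))
            row[answer] = 1
        if question in questions:
            # corresponds to MAX(IF(question = v, answer, '-'))
            row[question] = max(row[question], answer)
    return table


result = pivot(data)
for name, columns in result.items():
    print(name, columns)
```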

    Note: The EXECUTE IMMEDIATE part of the script above is exactly the same as in the previous post. The only change is that the original data is first prepared in the temp table data, which is then used in EXECUTE IMMEDIATE.
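To see what EXECUTE IMMEDIATE actually runs, it can help to rebuild the generated query text outside BigQuery. The Python sketch below imitates the two `STRING_AGG` expressions on the flattened sample data. `build_pivot_sql` is a hypothetical name used for illustration, and whitespace in the generated text is simplified compared with the SQL version.

```python
# Sketch only: rebuilds the query text that the EXECUTE IMMEDIATE step would generate.
data = [
    ("Bob", "Interest", "a"),
    ("Sue", "Interest", "a"),
    ("Sue", "Interest", "b"),
    ("Joe", "Interest", "b"),
    ("Joe", "Gender", "Male"),
    ("Bob", "Gender", "Female"),
    ("Sue", "DOB", "2020-10-17"),
    ("Bob", "country", "es"),
    ("Bob", "language", "ca"),
]


def build_pivot_sql(data):
    """Concatenate one MAX(IF(...)) column per distinct value, as STRING_AGG does."""
    interests = sorted({a for _, q, a in data if q == "Interest"})
    questions = sorted({q for _, q, _ in data if q != "Interest"})
    flag_cols = ", ".join(
        f"MAX(IF(answer = '{v}', 1, 0)) AS {v}" for v in interests
    )
    text_cols = ", ".join(
        f"MAX(IF(question = '{v}', answer, '-')) AS {v}" for v in questions
    )
    return f"SELECT name, {flag_cols}, {text_cols} FROM data GROUP BY name"


sql = build_pivot_sql(data)
print(sql)
```

Because the values are concatenated directly into the query text and reused as column aliases, this approach only works when every distinct answer and question is a valid column name.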
