使用Google SQL或BIGQUERY取消透视多个列 [英] Unpivoting multiple columns using Google SQL or BIGQUERY

查看:64
本文介绍了使用Google SQL或BIGQUERY取消透视多个列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直想将Google表格中的表格取消透视,例如下面的表格(EAV)或日期,ldap,Shift:

I have been tring to unpivot a table from google sheets table like the one below into (EAV) or Date, ldap, Shift:

        Date    Ldap1   Ldap2   Ldap3 </p>
       2020-04-01   Shift A Shift B Shift C </p>
       2020-04-02   Shift A Shift B Shift C</p>
       2020-04-03   Shift A Shift B Shift C</p> 

我的初始代码(在其他地方排列)可以工作,但是当列数增加到400时,我需要执行约400次.有什么想法可以使此操作更快,更实用?

My initial code (arranged from somewhere else) works but I need to perform this around 400 times as columns go up to 400. Any ideas to make this faster and more practical?




      SELECT date,  "ldap1" AS id_name, ldap1 AS id
      FROM yt_tns_vendor_ops_ham_cog_leads.workflow_test
      UNION ALL
      SELECT date,  "ldap2" AS id_name, ldap2 AS id 
      FROM yt_tns_vendor_ops_ham_cog_leads.workflow_test
      ORDER BY date, id_name

结果:

date    id_name id
2020-04-01  Ldap1   Shift A
2020-04-01  Ldap2   Shift B
2020-04-02  Ldap1   Shift A
2020-04-02  Ldap2   Shift B
2020-04-03  Ldap1   Shift A
2020-04-03  Ldap2   Shift B
2020-04-04  Ldap1   Shift A
2020-04-04  Ldap2   Shift B
2020-04-05  Ldap1   Shift A
2020-04-05  Ldap2   Shift B
2020-04-06  Ldap1   Shift A
2020-04-06  Ldap2   Shift B
2020-04-07  Ldap1   WO
2020-04-07  Ldap2   WO
2020-04-08  Ldap1   WO
2020-04-08  Ldap2   WO
2020-04-09  Ldap1   Shift A
2020-04-09  Ldap2   Shift B
2020-04-10  Ldap1   Shift A
2020-04-10  Ldap2   Shift B
2020-04-11  Ldap1   Shift A
2020-04-11  Ldap2   Shift B
2020-04-12  Ldap1   Shift A
2020-04-12  Ldap2   Shift B
2020-04-13  Ldap1   Shift A
2020-04-13  Ldap2   Shift B
2020-04-14  Ldap1   Shift A
2020-04-14  Ldap2   Shift B
2020-04-15  Ldap1   WO
2020-04-15  Ldap2   WO
2020-04-16  Ldap1   WO
2020-04-16  Ldap2   WO
2020-04-17  Ldap1   Shift A
2020-04-17  Ldap2   Shift B
2020-04-18  Ldap1   Shift A

推荐答案

以下是BigQuery标准SQL

Below is for BigQuery Standard SQL

#standardSQL
SELECT `date`,
  SPLIT(kv, ':')[OFFSET(0)] id_name,
  SPLIT(kv, ':')[OFFSET(1)] id
FROM `project.dataset.table` t,
UNNEST(SPLIT(REGEXP_REPLACE(TO_JSON_STRING(t), r'[{}"]', ''))) kv
WHERE SPLIT(kv, ':')[OFFSET(0)] != 'date'

如以下示例所示,应用于您问题的样本数据

if to apply to sample data from your question as in below example

#standardSQL
WITH `project.dataset.table` AS (
  SELECT DATE '2020-04-01' `date`, 'Shift A' ldap1, 'Shift B' ldap2,  'Shift C' ldap3 UNION ALL
  SELECT '2020-04-02', 'Shift A', 'Shift B',  'Shift C' UNION ALL
  SELECT '2020-04-03', 'Shift A', 'Shift B',  'Shift C' 
)
SELECT `date`,
  SPLIT(kv, ':')[OFFSET(0)] id_name,
  SPLIT(kv, ':')[OFFSET(1)] id
FROM `project.dataset.table` t,
UNNEST(SPLIT(REGEXP_REPLACE(TO_JSON_STRING(t), r'[{}"]', ''))) kv
WHERE SPLIT(kv, ':')[OFFSET(0)] != 'date'

结果是

Row date        id_name id   
1   2020-04-01  ldap1   Shift A  
2   2020-04-01  ldap2   Shift B  
3   2020-04-01  ldap3   Shift C  
4   2020-04-02  ldap1   Shift A  
5   2020-04-02  ldap2   Shift B  
6   2020-04-02  ldap3   Shift C  
7   2020-04-03  ldap1   Shift A  
8   2020-04-03  ldap2   Shift B  
9   2020-04-03  ldap3   Shift C     

以上查询可以进一步重构(取决于您的喜好)

Above query can be further refactored (depends on your tastes)

#standardSQL
SELECT `date`, id_name, id
FROM `project.dataset.table` t,
UNNEST(SPLIT(REGEXP_REPLACE(TO_JSON_STRING(t), r'[{}"]', ''))) kv,
UNNEST([STRUCT(SPLIT(kv, ':')[OFFSET(0)] AS id_name, SPLIT(kv, ':')[OFFSET(1)] AS id)])
WHERE id_name != 'date'

这一部分避免了多余的部分,而冗长程度略微

this one avoids redundant parts and slightly less verbose

这篇关于使用Google SQL或BIGQUERY取消透视多个列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆