如何在BigQuery中删除/更新嵌套数据 [英] How to delete/update nested data in bigquery

查看:94
本文介绍了如何在BigQuery中删除/更新嵌套数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有没有一种方法可以删除/更新bigquery中的嵌套字段?

Is there a way to delete/update nested field in bigquery?

假设我有此数据

wives.age   wives.name  name     
21             angel    adam     
20             kale      
21           victoria   rossi    
20           jessica         

或在json中

{"name":"adam","wives":[{"name":"angel","age":21},{"name":"kale","age":20}]}
{"name":"rossi","wives":[{"name":"victoria","age":21},{"name":"jessica","age":20}]}

从上面的数据中可以看到. 亚当有两个妻子,分别是天使和羽衣甘蓝.如何:

As you can see from the data above. Adam has 2 wives, named angel and kale. How to:

  1. 删除羽衣甘蓝记录.
  2. 将jessica更新为dessica

我尝试用Google搜索它,但找不到它.我也试着去打扰,等等,但是没有运气.

I tried to google this, but can't find it. I also tried to unnest, etc but no luck.

之所以要这样做,是因为我们将数组插入到错误的记录中,并希望在某种条件下删除/更新数组数据.

The reason why we want to do this is because we insert the array to the wrong records and want to remove/update array data with some condition.

推荐答案

以下是BigQuery标准SQL

Below is for BigQuery Standard SQL

#standardSQL
WITH updates AS (
  SELECT 'rossi' name, 'jessica' oldname, 'dessica' newname UNION ALL
  SELECT 'rossi' name, 'victoria' oldname, 'polly' newname UNION ALL
  SELECT 'adam' name, 'angel' oldname, 'jen' newname 
), divorces AS (
  SELECT 'adam' name, 'kale' wifename UNION ALL
  SELECT 'adam' name, 'milly' wifename UNION ALL
  SELECT 'rossi' name, 'linda' wifename      
)
SELECT t.name, 
  ARRAY(
    SELECT AS STRUCT 
      age, 
      CASE 
        WHEN NOT oldname IS NULL THEN newname
        ELSE name 
      END name
    FROM UNNEST(wives)
    LEFT JOIN UNNEST(updates) ON t.name = u.name AND name = oldname
    LEFT JOIN UNNEST(divorces) AS wifename ON t.name = d.name AND name = wifename
    WHERE wifename IS NULL
  ) waves
FROM `project.dataset.table` t
LEFT JOIN (
  SELECT name, ARRAY_AGG(STRUCT(oldname, newname)) updates
  FROM updates GROUP BY name
  ) u ON t.name = u.name
LEFT JOIN (
  SELECT name, ARRAY_AGG(wifename) divorces
  FROM divorces GROUP BY name
  ) d ON t.name = d.name

您可以使用以下伪数据来测试/玩

You can test / play with above using dummy data as below

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'adam' name, [STRUCT<age INT64, name STRING>(21, 'angel'), (20, 'kale'), (22, 'milly')] wives UNION ALL
  SELECT 'rossi', [STRUCT<age INT64, name STRING>(21, 'victoria'), (20, 'jessica'), (23, 'linda')]
), updates AS (
  SELECT 'rossi' name, 'jessica' oldname, 'dessica' newname UNION ALL
  SELECT 'rossi' name, 'victoria' oldname, 'polly' newname UNION ALL
  SELECT 'adam' name, 'angel' oldname, 'jen' newname 
), divorces AS (
  SELECT 'adam' name, 'kale' wifename UNION ALL
  SELECT 'adam' name, 'milly' wifename UNION ALL
  SELECT 'rossi' name, 'linda' wifename      
)
SELECT t.name, 
  ARRAY(
    SELECT AS STRUCT 
      age, 
      CASE 
        WHEN NOT oldname IS NULL THEN newname
        ELSE name 
      END name
    FROM UNNEST(wives)
    LEFT JOIN UNNEST(updates) ON t.name = u.name AND name = oldname
    LEFT JOIN UNNEST(divorces) AS wifename ON t.name = d.name AND name = wifename
    WHERE wifename IS NULL
  ) waves
FROM `project.dataset.table` t
LEFT JOIN (
  SELECT name, ARRAY_AGG(STRUCT(oldname, newname)) updates
  FROM updates GROUP BY name
  ) u ON t.name = u.name
LEFT JOIN (
  SELECT name, ARRAY_AGG(wifename) divorces
  FROM divorces GROUP BY name
  ) d ON t.name = d.name

结果符合预期

name    waves.age   waves.name   
adam    21          jen  
rossi   21          polly    
        20          dessica  

我希望您能够将以上内容应用于您的真实案例:o)

I hope you will be able to apply above to your real case :o)

这篇关于如何在BigQuery中删除/更新嵌套数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆