Fetch the data from an array of objects in SQL (BigQuery)


Question

I need to fetch the key-value pairs from the second object in an array and create new columns from the fetched data. I am only interested in the second object; the arrays vary in length (some have 3 objects, some have 4, and so on). The data looks like this:

[{'adUnitCode': ca-pub, 'id': 35, 'name': ca-pub}, {'adUnitCode': hmies, 'id': 49, 'name': HMIES}, {'adUnitCode': moda, 'id': 50, 'name': moda}, {'adUnitCode': nova, 'id': 55, 'name': nova}, {'adUnitCode': listicle, 'id': 11, 'name': listicle}]
[{'adUnitCode': ca-pub, 'id': 35, 'name': ca-pub-73}, {'adUnitCode': hmiuk-jam, 'id': 23, 'name': HM}, {'adUnitCode': recipes, 'id': 26, 'name': recipes}]
[{'adUnitCode': ca-pub, 'id': 35, 'name': ca-pub-733450927}, {'adUnitCode': digital, 'id': 48, 'name': Digital}, {'adUnitCode': movies, 'id': 50, 'name': movies}, {'adUnitCode': cannes-film-festival, 'id': 57, 'name': cannes-film-festival}, {'adUnitCode': article, 'id': 57, 'name': article}]

Desired output:

adUnitCode           id             name 
hmies                49             HMIES
hmiuk-jam            23             HM
digital              48             Digital

Recommended answer

The query below is for BigQuery Standard SQL:

#standardSQL
select 
  json_extract_scalar(second_object, "$.adUnitCode") as adUnitCode,
  json_extract_scalar(second_object, "$.id") as id,
  json_extract_scalar(second_object, "$.name") as name
from `project.dataset.table`, unnest(
  [json_extract_array(regexp_replace(mapping, r"(: )([\w-]+)(,|})", "\\1'\\2'\\3"))[safe_offset(1)]]
) as second_object

Applied to the sample data from your question, the output matches the desired output shown in the question.

As you can see, the "trick" here is to use a proper regexp in the regexp_replace function, so that every bare value gets wrapped in quotes before the string is parsed. The pattern above covers word characters (letters, digits, underscore) and - ; you can include more as needed. As an alternative, you can try regexp_replace(mapping, r"(: )([^,}]+)", "\\1'\\2'") as in the example below, which potentially covers more cases without changes in code:

#standardSQL
select 
  json_extract_scalar(second_object, "$.adUnitCode") as adUnitCode,
  json_extract_scalar(second_object, "$.id") as id,
  json_extract_scalar(second_object, "$.name") as name
from `project.dataset.table`, unnest(
  [json_extract_array(regexp_replace(mapping, r"(: )([^,}]+)", "\\1'\\2'"))[safe_offset(1)]]
) as second_object
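
To try the approach without a real table, the sample rows from the question can be inlined in a CTE. The following is only a sketch under stated assumptions: the CTE names sample and quoted, the quoted_mapping column, and the shortened sample rows are introduced here for illustration and are not part of the original answer. It uses the broader second pattern and also returns the intermediate string, so the effect of regexp_replace is visible before the JSON functions run.

```sql
#standardSQL
-- "sample" is a hypothetical stand-in for `project.dataset.table`;
-- its rows are shortened copies of the data shown in the question.
with sample as (
  select "[{'adUnitCode': ca-pub, 'id': 35, 'name': ca-pub}, {'adUnitCode': hmies, 'id': 49, 'name': HMIES}, {'adUnitCode': moda, 'id': 50, 'name': moda}]" as mapping
  union all
  select "[{'adUnitCode': ca-pub, 'id': 35, 'name': ca-pub-73}, {'adUnitCode': hmiuk-jam, 'id': 23, 'name': HM}, {'adUnitCode': recipes, 'id': 26, 'name': recipes}]"
),
quoted as (
  -- Wrap every bare value in single quotes, as the answer's queries do,
  -- so the JSON functions can read the string.
  select regexp_replace(mapping, r"(: )([^,}]+)", "\\1'\\2'") as quoted_mapping
  from sample
)
select
  quoted_mapping,
  json_extract_scalar(second_object, "$.adUnitCode") as adUnitCode,
  json_extract_scalar(second_object, "$.id") as id,
  json_extract_scalar(second_object, "$.name") as name
from quoted, unnest(
  -- safe_offset(1) picks the second element and yields NULL for shorter arrays
  [json_extract_array(quoted_mapping)[safe_offset(1)]]
) as second_object
```

Note that json_extract_scalar returns STRING values, so id comes back as text; wrap it in cast(... as int64) if a numeric column is needed.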
