How to compare two tables having record type column in BigQuery
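Question
I have two nested tables: a source table and a target table. I want to compare the nested columns of the two tables in order to check whether the weather data in the source table is being updated. Is there SQL in BigQuery that can do this?
Here are the approaches I have tried so far for comparing two tables with nested records:
1. First approach: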
SELECT to_json_string(info) FROM database.nested_table_source
except distinct
SELECT to_json_string(info) FROM nested_table_target
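to_json_string() does not work here, because it sometimes serializes the source and target rows in a different order, so the query reports differing records even when the data in the two tables is identical.
2. Second approach: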
select name
from dataset.nested_table_source a
join dataset.nested_table_target b
using(name)
where
a.name!=b.name and
(select string_agg(format('%t', s) order by key) from a.info s)
!= (select string_agg(format('%t', s) order by key ) from b.info s)
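In this approach I use the string_agg function to compare the two nested records, but I am not sure whether this is the right way to compare record fields.
What should I do in this situation?
Recommended answer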
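Here is one approach: stringify the ordered set of objects (that is, the info column of each table) and then compare the resulting strings with each other.
Below is an example using some dummy data: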
with source_data as (
  select
    "VICTOR" as name,
    array[
      struct("A" as key, 3 as value),
      struct("B" as key, 5 as value)
    ] as info
  union all
  select
    "MAX" as name,
    array[
      struct("A" as key, 0 as value),
      struct("B" as key, 1 as value)
    ] as info
  union all
  select
    "SAIF" as name,
    array[
      struct("A" as key, 0 as value),
      struct("B" as key, 1 as value)
    ] as info
),
target_data as (
  select
    "VICTOR" as name,
    array[
      struct("A" as key, 3 as value),
      struct("B" as key, 15 as value)
    ] as info
  union all
  select
    "MAX" as name,
    array[
      struct("A" as key, 0 as value),
      struct("B" as key, 1 as value)
    ] as info
)
select name, stringified_source_set as info
from (
  select
    s.name,
    -- Flatten each info array into a single string, ordered by key,
    -- so that the element order inside the array does not matter.
    array_to_string(
      array(
        select concat(cast(x.key as string), '|', cast(x.value as string))
        from unnest(t.info) as x
        order by x.key
      ),
      '|'
    ) as stringified_target_set,
    array_to_string(
      array(
        select concat(cast(x.key as string), '|', cast(x.value as string))
        from unnest(s.info) as x
        order by x.key
      ),
      '|'
    ) as stringified_source_set
  from source_data s
  left join target_data t on t.name = s.name
)
where (stringified_source_set != stringified_target_set) or (name is null)
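Note that the approach above performs both a "horizontal comparison" (comparing the info objects of matching rows) and a "vertical comparison" (finding entries that exist in the source table but are missing from the target table).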
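The same idea of ordering before stringifying can also be applied to the to_json_string() attempt from the question. The following is only a minimal sketch, not part of the original answer: it assumes every info element has key and value fields as in the dummy data, and the CTE contents are made up for illustration. VICTOR's elements are deliberately listed in a different order on the two sides, while MAX has a genuinely different value.

```sql
-- Sketch: sort each info array by key before serialising it,
-- then diff the two sides with EXCEPT DISTINCT.
-- The sample rows below are hypothetical illustration data.
with source_data as (
  select
    "VICTOR" as name,
    [struct("B" as key, 5 as value), struct("A" as key, 3 as value)] as info
  union all
  select
    "MAX" as name,
    [struct("A" as key, 0 as value), struct("B" as key, 1 as value)] as info
),
target_data as (
  select
    "VICTOR" as name,
    [struct("A" as key, 3 as value), struct("B" as key, 5 as value)] as info
  union all
  select
    "MAX" as name,
    [struct("A" as key, 0 as value), struct("B" as key, 2 as value)] as info
)
select
  name,
  -- Rebuild the array in a fixed order so equal data yields equal JSON.
  to_json_string(
    array(select as struct x.key, x.value from unnest(info) as x order by x.key)
  ) as info_json
from source_data
except distinct
select
  name,
  to_json_string(
    array(select as struct x.key, x.value from unnest(info) as x order by x.key)
  ) as info_json
from target_data
```

This returns rows that are in the source but have no identical counterpart in the target. To see differences in the other direction, swap the two SELECT statements around the EXCEPT DISTINCT.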