重组表并检查值 [英] Restructure table and check for values
问题描述
我有一个表(表1),看起来像下面-
I have one table (Table 1) which looks like below-
keys
AAB12B34
CC34DE5W
SEF5C6T4
SQA7ZZ87
LM24NO3P
X34YY78Z
还有另一个如下表(表2)-
And another table (Table 2) which looks like below-
category_id category_name associated_keys
111 Books CC34DE5W|SQA7ZZ87|LM24NO3P
222 Office LM24NO3P|AAB12B34
444 Furniture X34YY78Z|LM24NO3P|SQA7ZZ87|SEF5C6T4|CC34DE5W|AAB12B34
222 Office X34YY78Z
我想完成2个任务-
任务1:
在任何给定的点上,每个category_id我只希望有一行.如果有2行(意味着如果ID重复),那么我想按category_id分组并添加associated_keys.因此,该表的输出应该看起来像-
At any given point I want to have only one row for each category_id. If there are 2 rows (meaning if the id is repeated) then I want to group by category_id and add the associated_keys. So, out put of this table should look like-
表3-
category_id category_name associated_keys
111 Books CC34DE5W|SQA7ZZ87|LM24NO3P
222 Office LM24NO3P|AAB12B34|X34YY78Z
444 Furniture X34YY78Z|LM24NO3P|SQA7ZZ87|SEF5C6T4|CC34DE5W|AAB12B34
任务2:
然后,我想将表1 中键字段内的所有值转换为列名,然后检查表2 .如果特定的 category_id 在 associated_keys 字段中具有该键,那么我想在该键字段下添加1,否则插入0.
Then, I want to convert all values inside the keys field in Table 1 into column names and then check Table 2. If a particular category_id has that key in associated_keys field then I want to add 1 underneath that key field else insert 0.
最后,结果看起来像-
category_id category_name AAB12B34 CC34DE5W SEF5C6T4 SQA7ZZ87 LM24NO3P X34YY78Z
111 Books 0 1 0 1 1 0
222 Office 1 0 0 0 1 1
444 Furniture 1 1 1 1 1 1
推荐答案
以下内容适用于BigQuery Standard SQL,并积累了有关您案例的所有细微差别-希望您能够制作出适合您的新版本(请注意:它实际上与我最初的回答是相同的逻辑,但是对地址列名称以数字开头,在Table1中使用正确的列名称等进行了一些更正)
Below is for BigQuery Standard SQL and accumulate all new nuances about your case - hope you will be able to make this new cleaned version to work for you (note: it is actually the same logic as in my initial answer, but with some correction to address column names starting with digits, using correct column name in Table1, etc)
DECLARE statement STRING;
SET statement = (
WITH task1 AS (
SELECT category_id, category_name, STRING_AGG(associated_keys, '|') AS associated_keys
FROM `your_project.your_dataset.data`
GROUP BY category_id, category_name
)
SELECT '''
SELECT
category_id,
category_name, ''' || (
SELECT STRING_AGG('IF("' || ids || '" IN UNNEST(keys), 1, 0) AS col_' || ids, ', ')
FROM `your_project.your_dataset.keys`
) || ''' FROM task1, UNNEST([STRUCT(SPLIT(associated_keys, "|") AS keys)])'''
);
EXECUTE IMMEDIATE '''
WITH task1 AS (
SELECT category_id, category_name, STRING_AGG(associated_keys, '|') AS associated_keys
FROM `your_project.your_dataset.data`
GROUP BY category_id, category_name
)''' || statement;
以上假设下面的表格结构(和示例数据)
Above assumes below tables structure (and sample data)
`your_project.your_dataset.keys` AS (
SELECT 'AAB12B34' ids UNION ALL
SELECT '34DE5WCC' UNION ALL
SELECT 'SEF5C6T4' UNION ALL
SELECT 'SQA7ZZ87' UNION ALL
SELECT '24NO3PLM' UNION ALL
SELECT 'X34YY78Z'
), `your_project.your_dataset.data` AS (
SELECT 111 category_id, 'Books' category_name, '34DE5WCC|SQA7ZZ87|24NO3PLM|SQA7ZZ87|sample300|sample300' associated_keys UNION ALL
SELECT 222, 'Office', '24NO3PLM|AAB12B34|X34YY78Z' UNION ALL
SELECT 444, 'Furniture', 'X34YY78Z|24NO3PLM|SQA7ZZ87|SEF5C6T4|34DE5WCC|AAB12B34|sample200' UNION ALL
SELECT 222, 'Office', 'X34YY78Z|sample100' UNION ALL
SELECT 111, 'Books', 'AAB12B34'
)
如果将代码应用于上述示例数据
if to apply code to above sample data
结果是
Row category_id category_name col_AAB12B34 col_34DE5WCC col_SEF5C6T4 col_SQA7ZZ87 col_24NO3PLM col_X34YY78Z
1 111 Books 1 1 0 1 1 0
2 222 Office 1 0 0 0 1 1
3 444 Furniture 1 1 1 1 1 1
这篇关于重组表并检查值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!