How to split column data into 6-character chunks and form one row per chunk in BigQuery
Problem description
I need to split a column's data into chunks of 6 characters each.
Input:
+----+----------------------+
|col1|                  col2|
+----+----------------------+
|  d1|  X11 F11 1000KG123456|
|  d2| X22 F22 3500Kabcdefgh|
+----+----------------------+
Expected output:
+----+------+
|col1|  col2|
+----+------+
|  d1|   X11|
|  d1|   F11|
|  d1|1000KG|
|  d1|123456|
|  d2|   X22|
|  d2|   F22|
|  d2|3500Ka|
|  d2|bcdefg|
|  d2|     h|
+----+------+
I need a generic query rather than a hard-coded one, because my table holds a large amount of data. I tried the query below, but it did not work.
with mytable as
(select col1,col2 from `table_name`)
select col1, c2
from mytable, unnest(SPLIT(col2, '(?<=\G......)')) as c2
Here '(?<=\G......)' is the regex I use in Spark; the same regex does not work in BigQuery. Please help, as I need to implement this in BigQuery in production soon.
Recommended answer
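Try the following query. SPLIT in BigQuery takes a literal delimiter rather than a regex, and BigQuery's RE2 engine supports neither lookbehind nor \G, so extract the chunks with REGEXP_EXTRACT_ALL instead. The quantifier {1,6} also keeps a trailing chunk shorter than 6 characters, such as the final h in row d2.
with mytable as
(select col1, col2 from `table_name`)
select col1, c2
from mytable, unnest(REGEXP_EXTRACT_ALL(col2, r'.{1,6}')) as c2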
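To try the approach without touching the real table, here is a minimal self-contained sketch with inline sample rows. It assumes that each field in col2 is padded with trailing spaces to 6 characters (the padding may have been collapsed in the listing above), and the names chunk, piece, and pos are illustrative only. It uses the quantifier {1,6} so that a final chunk shorter than 6 characters is kept, TRIM to drop the assumed padding, and WITH OFFSET to preserve the order of the chunks within each row.

```sql
-- Inline sample rows; the 6-character padding inside col2 is an assumption.
WITH mytable AS (
  SELECT 'd1' AS col1, 'X11   F11   1000KG123456' AS col2
  UNION ALL
  SELECT 'd2', 'X22   F22   3500Kabcdefgh'
)
SELECT
  col1,
  TRIM(chunk) AS piece,  -- strip the assumed padding spaces
  pos                    -- 0-based position of the chunk within col2
FROM mytable,
  -- {1,6} matches up to 6 characters, so a short trailing chunk is not lost
  UNNEST(REGEXP_EXTRACT_ALL(col2, r'.{1,6}')) AS chunk WITH OFFSET AS pos
ORDER BY col1, pos;
```

Under these assumptions, d1 should yield four rows and d2 five, ending with the single character h. If the padding must be preserved, drop the TRIM call.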
For more details, see:
https://github.com/google/re2/wiki/Syntax
https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#regexp_extract_all