How to split column data into 6-character chunks and form rows in BigQuery


Problem description

I need to split the data in a column into chunks of 6 characters each, with one chunk per output row.

Input:
+----+-------------------------+
|col1|                     col2|
+----+-------------------------+
|  d1| X11   F11   1000KG123456|
|  d2|X22   F22   3500Kabcdefgh|
+----+-------------------------+

Expected output:
+----+------+
|col1|  col2|
+----+------+
|  d1|   X11|
|  d1|   F11|
|  d1|1000KG|
|  d1|123456|
|  d2|   X22|
|  d2|   F22|
|  d2|3500Ka|
|  d2|bcdefg|
|  d2|     h|
+----+------+

I need a generic query, not a hard-coded one. My table holds a large amount of data. I tried the query below, but it did not work.

with mytable as 
(select col1,col2 from `table_name`)
select col1, c2
from mytable, unnest(SPLIT(col2, '(?<=\\G......)')) as c2

Here '(?<=\G......)' is the regex I used in Spark; the same regex does not work in BigQuery. Please help: I need to implement this in BigQuery in production soon.

Recommended answer

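BigQuery uses the RE2 regex library, which does not support lookbehind or \G, so the Spark pattern cannot be used with SPLIT. Try the following query instead. REGEXP_EXTRACT_ALL returns every non-overlapping match of the pattern as an array, and UNNEST turns that array into one row per match.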

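-- col2 holds the string to split; replace `table_name` with your table.
with mytable as 
(select col1,col2 from `table_name`)
select col1, c2
-- '.{1,6}' takes 6 characters at a time and keeps a shorter final chunk (e.g. 'h');
-- '.{6}' alone would drop it.
from mytable, unnest(REGEXP_EXTRACT_ALL(col2, '.{1,6}')) as c2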
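Before running a fixed-width pattern against a large table, it can help to check how it chunks the sample values locally. The sketch below is only an approximation: it uses Python's re.findall as a stand-in for REGEXP_EXTRACT_ALL (assuming both return non-overlapping matches from left to right for these simple patterns), and the helper name split_fixed_width is hypothetical. It also shows that '.{6}' silently drops a trailing remainder shorter than 6 characters, while '.{1,6}' keeps it.

```python
import re
from typing import List


def split_fixed_width(value: str, width: int = 6, keep_remainder: bool = True) -> List[str]:
    """Split value into consecutive chunks of at most `width` characters.

    Hypothetical helper that mimics what the regex does; not a BigQuery API.
    """
    # '.{1,6}' is greedy: it takes 6 characters whenever they are available
    # and a shorter tail at the end. '.{6}' only matches full chunks.
    pattern = f".{{1,{width}}}" if keep_remainder else f".{{{width}}}"
    return re.findall(pattern, value)


# Sample rows taken from the question.
rows = [
    ("d1", "X11   F11   1000KG123456"),
    ("d2", "X22   F22   3500Kabcdefgh"),
]

# Emulate UNNEST: emit one (col1, chunk) row per extracted chunk.
exploded = [(col1, chunk) for col1, col2 in rows for chunk in split_fixed_width(col2)]

for row in exploded:
    print(row)
```

The chunks keep their padding spaces (for example 'X11' followed by three spaces); if the padding is unwanted, trimming each chunk afterwards is one option.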

For more details, see:

https://github.com/google/re2/wiki/Syntax

https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#regexp_extract_all
