How to split column data into 6-character chunks and turn each chunk into a row in BigQuery

Problem description

I need to split a column's data into chunks of 6 characters each, with one output row per chunk.

Input:
+------+---------------------------+
| col1 | col2                      |
+------+---------------------------+
| d1   | X11   F11   1000KG123456  |
| d2   | X22   F22   3500Kabcdefgh |
+------+---------------------------+

Expected output:
+------+--------+
| col1 | col2   |
+------+--------+
| d1   | X11    |
| d1   | F11    |
| d1   | 1000KG |
| d1   | 123456 |
| d2   | X22    |
| d2   | F22    |
| d2   | 3500Ka |
| d2   | bcdefg |
| d2   | h      |
+------+--------+

I need a generic query, not a hard-coded one, because my table holds a large amount of data. I tried the query below, but it did not work.

with mytable as 
(select col1,col2 from `table_name`)
select col1, c2
from mytable, unnest(SPLIT(col2, '(?<=\G......)')) as c2

Here '(?<=\G......)' is the regex I use in Spark; the same regex does not work in BigQuery. Please help, as I need to implement this in BigQuery in production soon.

Recommended answer

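The Spark approach fails for two reasons: BigQuery's SPLIT takes a literal delimiter rather than a regular expression, and BigQuery's regex functions use RE2, which supports neither lookbehind nor \G. Use REGEXP_EXTRACT_ALL to pull out the consecutive chunks instead. Try the following query: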

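with mytable as 
(select col1, col2 from `table_name`)
select col1, c2
-- '.{1,6}' matches up to 6 characters at a time, so a shorter final chunk
-- (such as the trailing 'h' in row d2) is kept; '.{6}' would drop it
from mytable, unnest(REGEXP_EXTRACT_ALL(col2, '.{1,6}')) as c2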
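To try the chunking without touching a real table, the sketch below inlines the two sample rows from the question. It is an illustration under a few assumptions rather than a definitive implementation: the TRIM call assumes the padding spaces shown in the expected output are not wanted, and the chunk_index column (from WITH OFFSET) is an addition for keeping chunks in their original order, not something the question asked for. Note the quantifier {1,6}: a pattern that demands exactly six characters would drop a shorter final chunk such as the trailing h.

```sql
-- Sample rows copied from the question; replace this CTE with the real table.
WITH mytable AS (
  SELECT 'd1' AS col1, 'X11   F11   1000KG123456' AS col2
  UNION ALL
  SELECT 'd2', 'X22   F22   3500Kabcdefgh'
)
SELECT
  col1,
  TRIM(chunk) AS c2,      -- assumption: padding spaces are not wanted
  pos AS chunk_index      -- hypothetical helper column: 0-based chunk position
FROM mytable,
  -- REGEXP_EXTRACT_ALL returns every non-overlapping match as an array,
  -- so each run of up to 6 characters becomes one array element.
  UNNEST(REGEXP_EXTRACT_ALL(col2, r'.{1,6}')) AS chunk WITH OFFSET AS pos
ORDER BY col1, pos;
```

Because `.` does not match a newline by default in RE2, values containing line breaks would be chunked per line; if that matters for the data, the pattern would need the `(?s)` flag.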

For more details, see:

https://github.com/google/re2/wiki/Syntax

https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#regexp_extract_all
