将文本字符串拆分为 presto 中的匹配列 [英] splitting a text string to matching columns in presto

查看:95
本文介绍了将文本字符串拆分为 presto 中的匹配列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一份来自 presto 查询的报告,该报告以字符串形式提供信息

I have a report from a presto query that gives me information in a string

原始数据如下所示:

c_pre=CI2UhdX95uACFcKIdwodZ8QETQ;gtm=2od241;auiddc=*;u1=cz;u10=Not
Available;u11=Not Available;u12=1;u13=Not Available;u14=SGD;u15=Not
Available;u3=pdp;u4=undefined;u6=Not Available;~oref=https://www.bbc.com/

我找到了一个 excel 解决方法,可以将其拆分为单独的列.附截图供参考

I found a excel workaround that splits this into seperate columns. screenshot attached for reference

这个过程仍然需要很长时间才能完成,我希望使用 presto 仪表板来自动化这个过程.

This process still takes quite a long time to do, and I was hoping to use the presto dashboard to automate this.

具有相同 u 前缀的所有项目(上例中的 u3、u13 等)必须进入同一列,否则我只会做一个字符串切片.不幸的是,不同的字符串根据存储的数据有不同的结果,导致同一列中的 u 前缀不同.截图显示为什么这令人困惑

All of the items with the same u prefix (u3, u13, etc in the example above ) has to go into the same column, otherwise I would just do a string slice. Unfortunately, different strings have a different result depending on the data stored, resulting in different u prefixes in the same column.Screenshot to show why this is confusing

Excel 查询:

=IFERROR(
RIGHT(
INDEX(RD!2:2,1,
MATCH('Data LU'!A$1&"="&"*",RD!2:2,0)),
(LEN(
INDEX(RD!2:2,1,
MATCH('Data LU'!A$1&"="&"*",RD!2:2,0)))-(LEN(A$1)+1))),"")

如果有帮助

推荐答案

假设你知道所有可能的前缀的集合,你可以这样做:

Assuming you know the set of all possible prefixes, you could do something like this:

with t as (select split_to_map(<column>,';','=') map from <table>)
select
    element_at(map, 'u1') as u1,
    element_at(map, 'u2') as u2,
    element_at(map, 'u3') as u3,
    ...
from t

不幸的是,目前(从版本 304 开始)无法将值动态分解为单独的列.

Unfortunately, there's currently (as of version 304) no way to explode the values into separate columns dynamically.

您可以在此处找到 split_to_mapelement_at 的文档:https://trino.io/docs/current/functions/string.html#split_to_maphttps://trino.io/docs/current/functions/map.html

You can find the documentation for split_to_map and element_at here: https://trino.io/docs/current/functions/string.html#split_to_map https://trino.io/docs/current/functions/map.html

这篇关于将文本字符串拆分为 presto 中的匹配列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆