How to convert a string to an array of struct in Hive and explode it?

Problem description

I have data in the following format in Hive, in a table declared as test(seq string, result string):

|seq  | result                                                                                                                                                                                                                                                                                                                                                 |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|0001 | [{"offerId":"Default_XYZ","businessName":"Apple","businessGroup":"Default","businessIssue":"Default","interactionId":"-4930126168287369915","campaignID":"P-1","rank":"1"},{"offerId":"Default_NAV","businessName":"Orange","businessGroup":"Default","businessIssue":"Default","interactionId":"-7830126168223452134","campaignID":"P-1","rank":"2"}] |

The output should look like this:

|seq  | offerId     | businessName   | businessGroup| businessIssue | interactionId        | campaignId | rank |
----------------------------------------------------------------------------------------------------------------
|0001 | Default_XYZ | Apple          | Default      | Default       | -4930126168287369915 | P-1        | 1    |
|0001 | Default_NAV | Orange         | Default      | Default       | -7830126168223452134 | P-1        | 2    |

I tried to convert the string to an array of struct, but a direct CAST did not work.

Any help, please?

[EDIT: I tried the query below]

 select sequenceNumber, offerId, businessName, rank from (

   select sequenceNumber,
          collect_list(oid['offerId']) as offerid_list,
          collect_list(oid['businessName']) as businessName_list,
          collect_list(oid['rank']) as rank_list
   from (
     select sequenceNumber,
            str_to_map(translate(offer_Id, '{}', '')) as oid
     from test
     lateral view explode(split(translate(result, '[]"', ''), "\\},")) oid as offer_id
   ) x
   group by sequenceNumber

 ) y
 lateral view explode(offerid_list) olist as offerId
 lateral view explode(businessName_list) olist as businessName
 lateral view explode(rank_list) rlist as rank

Solution

I found one solution to my question:

select
  seq,
  split(split(results, ",")[0], ':')[1] as offerId,
  split(split(results, ",")[1], ':')[1] as businessName,
  split(split(results, ",")[2], ':')[1] as businessGroup,
  split(split(results, ",")[3], ':')[1] as businessIssue,
  split(split(results, ",")[4], ':')[1] as interactionId,
  split(split(results, ",")[5], ':')[1] as campaignId,
  -- the last field may still end with } or ], so strip those characters
  regexp_replace(split(split(results, ",")[6], ":")[1], "[\\]|}]", "") as rank
from
(
  -- remove double quotes and square brackets, then split into one string per object
  select seq,
         split(translate(result, '"[]', ''), "},") as r
  from test
) t1
LATERAL VIEW explode(r) rr AS results
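
The query above depends on the position of each field and on no value containing a comma or a colon. As a possible alternative, here is a minimal sketch that keeps each object as JSON and reads the fields by name with Hive's built-in json_tuple. It assumes that result always holds a flat JSON array of objects with no whitespace around the outer brackets, and that the sequence "}," followed by "{" never appears inside a value. The aliases t, e, j and obj are names I picked for illustration. I have not run it against the original table, so treat it as a starting point rather than a verified answer.

```sql
-- Step 1: strip the outer [ and ] from the array string.
-- Step 2: split only on commas that sit between } and { (regex lookaround),
--         so each exploded row holds one complete JSON object.
-- Step 3: extract fields by key with json_tuple; keys are case-sensitive,
--         which is why 'campaignID' is spelled as in the sample data.
SELECT
  t.seq,
  j.offerId,
  j.businessName,
  j.businessGroup,
  j.businessIssue,
  j.interactionId,
  j.campaignId,
  j.rank
FROM test t
LATERAL VIEW explode(
  split(
    regexp_replace(t.result, '^\\[|\\]$', ''),
    '(?<=\\}),(?=\\{)'
  )
) e AS obj
LATERAL VIEW json_tuple(
  e.obj,
  'offerId', 'businessName', 'businessGroup', 'businessIssue',
  'interactionId', 'campaignID', 'rank'
) j AS offerId, businessName, businessGroup, businessIssue,
       interactionId, campaignId, rank;
```

Because the fields are looked up by key, reordering them inside an object should not change the output, and a missing key comes back as NULL instead of shifting the other columns.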
