Extracting strings between distinct characters using Hive SQL

Views: 980. Published: 2021/5/13 20:15:01. Tags: hadoop, hive, hiveql

Problem description

I have a field called geo_data_display that contains country, region, and DMA. The three values sit between "=" and "&" characters: country between the first "=" and the first "&", region between the second "=" and the second "&", and DMA between the third "=" and the third "&". Here's a reproducible version of the table. Country is always a character value, but region and DMA can be either numeric or character, and DMA doesn't exist for all countries.

Some sample values are:

country=us&region=tx&dma=625&domain=abc.net&zipcodes=76549
country=us&region=ca&dma=803&domain=abc.com&zipcodes=90404 
country=tw&region=hsz&domain=hinet.net&zipcodes=300
country=jp&region=1&dma=a&domain=hinet.net&zipcodes=300

I have some sample SQL, but the geo_dma line isn't working at all and the geo_region line only works for character values:

SELECT 

UPPER(REGEXP_REPLACE(split(geo_data_display, '\\&')[0], 'country=', '')) AS geo_country
,UPPER(split(split(geo_data_display, '\\&')[1],'\\=')[1]) AS geo_region
,split(split(cast(geo_data_display as int), '\\&')[2],'\\=')[2] AS geo_dma
FROM mytable

Recommended answer

Source:

regexp_extract(string subject, string pattern, int index)

Returns the string extracted using the pattern. For example, regexp_extract('foothebar', 'foo(.*?)(bar)', 1) returns 'the'
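To make the index argument concrete, here is a small sketch that runs the documentation's own example with three different indexes. It assumes a Hive version that allows SELECT without a FROM clause; the column aliases are illustrative only.

```sql
-- The index selects a capture group of the pattern, following Java regex group numbering.
SELECT
  regexp_extract('foothebar', 'foo(.*?)(bar)', 0) AS whole_match,   -- index 0: the entire matched text
  regexp_extract('foothebar', 'foo(.*?)(bar)', 1) AS first_group,   -- index 1: 'the', as in the quoted description
  regexp_extract('foothebar', 'foo(.*?)(bar)', 2) AS second_group;  -- index 2: the second group, (bar)
```

Index 1 is the one that matters here, because the value to extract is always wrapped in the first pair of parentheses.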

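select
      -- group 1 is the text between each key and the key expected to follow it
      regexp_extract(geo_data_display, 'country=(.*?)(&region)', 1) as geo_country,
      regexp_extract(geo_data_display, 'region=(.*?)(&dma)', 1) as geo_region,
      regexp_extract(geo_data_display, 'dma=(.*?)(&domain)', 1) as geo_dma
from mytable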
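One caveat: the patterns above depend on the next key being present, so for the sample row without a dma the region pattern has no "&dma" to match. A possible variant, offered as a sketch rather than a definitive fix, captures everything up to the next "&" instead. The CTE name sample_rows, its inline rows (copied from the question's samples), and the column aliases are assumptions for illustration. The last column shows a different technique, str_to_map, which splits the whole string into key/value pairs.

```sql
-- Hypothetical inline rows standing in for mytable (values taken from the question).
WITH sample_rows AS (
  SELECT 'country=us&region=tx&dma=625&domain=abc.net&zipcodes=76549' AS geo_data_display
  UNION ALL
  SELECT 'country=tw&region=hsz&domain=hinet.net&zipcodes=300' AS geo_data_display
  UNION ALL
  SELECT 'country=jp&region=1&dma=a&domain=hinet.net&zipcodes=300' AS geo_data_display
)
SELECT
  -- (^|&) anchors the key at the start of the string or right after a separator;
  -- ([^&]*) captures the value up to the next '&' or the end of the string.
  upper(regexp_extract(geo_data_display, '(^|&)country=([^&]*)', 2)) AS geo_country,
  upper(regexp_extract(geo_data_display, '(^|&)region=([^&]*)', 2))  AS geo_region,
  regexp_extract(geo_data_display, '(^|&)dma=([^&]*)', 2)            AS geo_dma,
  -- Alternative: build a map from 'key=value' pairs separated by '&' and look up one key.
  str_to_map(geo_data_display, '&', '=')['dma']                      AS geo_dma_from_map
FROM sample_rows;
```

Neither approach depends on the order of the keys. For the row without a dma, the regex column should come back empty instead of picking up a neighbouring value, and the map lookup should return NULL, which may be easier to filter on.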
