In Hive, how to read NULL / empty tags present within an XML using the explode(XPATH(..)) function?


Problem description

In the Hive query below, I also need to read the null / empty "string" tags from the XML content. Currently only the non-empty "string" tags end up in the list returned by XPATH().

with your_data as (
select  '<ParentArray>
    <ParentFieldArray>
        <Name>ABCD</Name>
        <Value>
            <string>111</string>
            <string></string>
            <string>222</string>
        </Value>
    </ParentFieldArray>
    <ParentFieldArray>
        <Name>EFGH</Name>
        <Value>
            <string/>
            <string>444</string>
            <string></string>
            <string>555</string>

        </Value>
    </ParentFieldArray>
</ParentArray>' as xmlinfo
)

select Name, Value 
  from your_data d
       lateral view outer explode(XPATH(xmlinfo, 'ParentArray/ParentFieldArray/Name/text()')) pf as  Name
       lateral view outer explode(XPATH(xmlinfo, concat('ParentArray/ParentFieldArray[Name="', pf.Name, '"]/Value/string/text()'))) vl as Value;

Expected output of the query:

Name    Value
ABCD    111
ABCD    
ABCD    222
EFGH    
EFGH    444
EFGH    
EFGH    555

Recommended answer

The problem is that XPATH returns a NodeList, and an empty node is not included in that list: the path ends in text(), and an empty <string> element has no text node to select.
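The behaviour described above can be reproduced outside Hive with the JDK's own XPath API, which is what appears in the stack trace for Hive's xpath UDF. The following is only a minimal sketch: the class name EmptyTextNodeDemo and the helper countNodes are made up for illustration, and the XML is a trimmed copy of the first ParentFieldArray from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Hive.
public class EmptyTextNodeDemo {
    // Trimmed copy of the first ParentFieldArray from the question.
    static final String XML =
        "<ParentArray><ParentFieldArray><Name>ABCD</Name><Value>"
      + "<string>111</string><string></string><string>222</string>"
      + "</Value></ParentFieldArray></ParentArray>";

    // Returns how many nodes the XPath expression selects from XML.
    static int countNodes(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Three <string> elements exist ...
        System.out.println(countNodes("ParentArray/ParentFieldArray/Value/string"));        // 3
        // ... but only two of them have a text node child.
        System.out.println(countNodes("ParentArray/ParentFieldArray/Value/string/text()")); // 2
    }
}
```

The empty element is present in the document; it simply has no text node child, so a path ending in text() has nothing to return for it.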

Concatenating with some string inside the XPath expression, e.g. concat(/Value/string/text()," "), does not work here, because concat() returns a string rather than a NodeList:

Caused by: javax.xml.xpath.XPathExpressionException: com.sun.org.apache.xpath.internal.XPathException: Can not convert #STRING to a NodeList!
    at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.evaluate(XPathExpressionImpl.java:195)
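The error can be reproduced with the JDK XPath API alone. The sketch below assumes, consistently with the stack trace, that the caller asks for a node-set result; the class name ConcatNodeSetDemo, its helper methods, and the small XML snippet are invented for illustration. It also shows why concat() would not help even if the type matched: in XPath 1.0, converting a node-set to a string uses only its first node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Hive.
public class ConcatNodeSetDemo {
    static final String XML =
        "<Value><string>111</string><string></string><string>222</string></Value>";
    static final String EXPR = "concat(/Value/string/text(), ' ')";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates EXPR as a string: concat() uses only the first text node.
    static String asString() {
        try {
            return (String) XPathFactory.newInstance().newXPath()
                .evaluate(EXPR, parse(), XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true if asking for a node-set result is rejected.
    static boolean failsAsNodeSet() {
        try {
            XPathFactory.newInstance().newXPath()
                .evaluate(EXPR, parse(), XPathConstants.NODESET);
            return false;
        } catch (XPathExpressionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + asString() + "]"); // [111 ]
        System.out.println(failsAsNodeSet());
    }
}
```

So a string-valued XPath function cannot feed explode(XPATH(..)), and even as a plain string it would carry one value, not one entry per tag.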

An easy solution is to replace <string></string> and <string/> with <string>NULL</string>, and then convert the 'NULL' string to null.

Demo:

with your_data as (
select  '<ParentArray>
    <ParentFieldArray>
        <Name>ABCD</Name>
        <Value>
            <string>111</string>
            <string></string>
            <string>222</string>
        </Value>
    </ParentFieldArray>
    <ParentFieldArray>
        <Name>EFGH</Name>
        <Value>
            <string/>
            <string>444</string>
            <string></string>
            <string>555</string>
        </Value>
    </ParentFieldArray>
</ParentArray>' as xmlinfo
)

select name, case when value='NULL' then null else value end value
  from (select regexp_replace(xmlinfo,'<string></string>|<string/>','<string>NULL</string>') xmlinfo 
          from your_data d
       ) d
       lateral view outer explode(XPATH(xmlinfo, 'ParentArray/ParentFieldArray/Name/text()')) pf as  Name
       lateral view outer explode(XPATH(xmlinfo, concat('ParentArray/ParentFieldArray[Name="', pf.Name, '"]/Value/string/text()'))) vl as value

Result:

name    value
ABCD    111
ABCD    
ABCD    222
EFGH    
EFGH    444
EFGH    
EFGH    555
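The replace-then-convert idea can also be checked in isolation with plain Java, using the same regular expression as the query. This is a sketch under assumptions, not Hive's implementation: the class SentinelReplaceDemo and the method values are hypothetical names, and the XML is a trimmed copy of the second Value block.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Hive.
public class SentinelReplaceDemo {
    // Trimmed copy of the second Value block from the question.
    static final String XML =
        "<Value><string/><string>444</string><string></string><string>555</string></Value>";

    // Fills empty tags with the sentinel, reads all text nodes,
    // then maps the sentinel back to null.
    static List<String> values(String xml) {
        String patched =
            xml.replaceAll("<string></string>|<string/>", "<string>NULL</string>");
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                "/Value/string/text()",
                DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(patched))),
                XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                String v = nodes.item(i).getNodeValue();
                out.add("NULL".equals(v) ? null : v);
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(values(XML)); // [null, 444, null, 555]
    }
}
```

Two limits follow directly from the pattern: a tag that genuinely contains the text NULL is also turned into null, and variants such as <string /> or tags containing only whitespace are not matched by this exact regular expression.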
