In Hive, how to read NULL / empty tags present within an XML using the explode(XPATH(..)) function?
Problem description
In the Hive query below, I also need to read the null / empty "string" tags from the XML content. At present, only the non-null "string" tags are picked up in the XPATH() list.
with your_data as (
select '<ParentArray>
<ParentFieldArray>
<Name>ABCD</Name>
<Value>
<string>111</string>
<string></string>
<string>222</string>
</Value>
</ParentFieldArray>
<ParentFieldArray>
<Name>EFGH</Name>
<Value>
<string/>
<string>444</string>
<string></string>
<string>555</string>
</Value>
</ParentFieldArray>
</ParentArray>' as xmlinfo
)
select Name, Value
from your_data d
lateral view outer explode(XPATH(xmlinfo, 'ParentArray/ParentFieldArray/Name/text()')) pf as Name
lateral view outer explode(XPATH(xmlinfo, concat('ParentArray/ParentFieldArray[Name="', pf.Name, '"]/Value/string/text()'))) vl as Value;
Expected output of the query:
Name Value
ABCD 111
ABCD
ABCD 222
EFGH
EFGH 444
EFGH
EFGH 555
Recommended answer
The problem here is that XPATH returns a NodeList, and an empty node is not included in that list. The path ends in text(), which selects text nodes, and an empty element (<string></string> or <string/>) has no text node to select.
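A minimal sketch of that behaviour, using a made-up one-line XML literal (not the question's data) so it can be run on its own:

```sql
-- Hypothetical sample: four <string> elements, two of them empty.
select xpath(
  '<Value><string>111</string><string></string><string/><string>222</string></Value>',
  'Value/string/text()'
) as vals;
-- Only the two non-empty values (111 and 222) are expected in the array;
-- the empty elements have no text node, so they leave no placeholder.
```

Because the array carries no placeholder for the empty tags, explode() has nothing to turn into a row for them.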
Concatenation with some string inside the XPATH expression, for example concat(/Value/string/text()," "), does not work here:
Caused by: javax.xml.xpath.XPathExpressionException: com.sun.org.apache.xpath.internal.XPathException: Can not convert #STRING to a NodeList!
at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.evaluate(XPathExpressionImpl.java:195)
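The exception is consistent with XPath typing: concat() evaluates to a string, while xpath() asks the evaluator for a node list. As a hedged illustration only (the XML literal is made up), Hive's xpath_string() requests a string result, so an expression of this kind should be accepted there:

```sql
-- Hypothetical sample; string[2] is the empty element.
-- xpath_string() asks for a string, so concat() is a valid top-level expression.
select xpath_string(
  '<Value><string>111</string><string></string></Value>',
  'concat(Value/string[2], "NULL")'
) as second_value;
```

This yields one string per call rather than one array entry per tag, so it does not by itself give explode() a row for each empty tag.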
An easy solution is to replace <string></string> and <string/> with <string>NULL</string>, and then convert the 'NULL' string to null.
Demo:
with your_data as (
select '<ParentArray>
<ParentFieldArray>
<Name>ABCD</Name>
<Value>
<string>111</string>
<string></string>
<string>222</string>
</Value>
</ParentFieldArray>
<ParentFieldArray>
<Name>EFGH</Name>
<Value>
<string/>
<string>444</string>
<string></string>
<string>555</string>
</Value>
</ParentFieldArray>
</ParentArray>' as xmlinfo
)
select name, case when value='NULL' then null else value end value
from (select regexp_replace(xmlinfo,'<string></string>|<string/>','<string>NULL</string>') xmlinfo
from your_data d
) d
lateral view outer explode(XPATH(xmlinfo, 'ParentArray/ParentFieldArray/Name/text()')) pf as Name
lateral view outer explode(XPATH(xmlinfo, concat('ParentArray/ParentFieldArray[Name="', pf.Name, '"]/Value/string/text()'))) vl as value
Result:
name value
ABCD 111
ABCD
ABCD 222
EFGH
EFGH 444
EFGH
EFGH 555
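The pattern in the demo matches only the exact spellings <string></string> and <string/>. If the real data may also contain <string /> or tags holding only whitespace (an assumption, since the question does not show such cases), a slightly looser pattern could be sketched as follows:

```sql
-- Assumption: input may contain "<string />" or whitespace-only tags.
-- The backslash is doubled so that the regex engine receives \s.
select regexp_replace(
  '<Value><string /><string>  </string><string>444</string></Value>',
  '<string\\s*/>|<string>\\s*</string>',
  '<string>NULL</string>'
) as patched_xml;
```

Note that the 'NULL' sentinel cannot be told apart from a tag whose real text is NULL; if that can occur, a marker string that never appears in the data is a safer choice.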