如何从Hive中的json字符串中提取选定的值 [英] How to extract selected values from json string in Hive

查看:1301
本文介绍了如何从Hive中的json字符串中提取选定的值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在Hive中运行一个简单的查询,该查询会产生以下输出(以及其他一些附加列.

|------|-----------------------------------------------------------|
| col1 | col2                                                      |
|------|-----------------------------------------------------------|
|   A  | {"variable1":123,"variable2":456,"variable3":789}         |                                          
|------|-----------------------------------------------------------|
|   B  | {"variable1":222,"variable2":333,"variable3":444}         |
--------------------------------------------------------------------

我需要能够解析json字符串并在SELECT语句本身期间提取每个令牌的值,以便我可以合并WHERE语句以仅返回字符串中对我有价值的部分. /p>

所以我的最终输出可能是这样的:

|------------------------------------------|
| col1 |variable1 | variable2 | variable3  |                                      
|------------------------------------------|
|  A   |   123    |    456    |    789     |                                    
|------------------------------------------|
|  B   |   222    |    333    |    444     |
--------------------------------------------

我尝试使用各种功能,包括使用esnaples中指定的参数结构来包括SPLIT和GET_JSON_OBJECT,但所有返回错误都诸如此类:

No matching method for class org.apache.hadoop.hive.ql.udf.UDFJson 
with (struct<...>, string). Possible choices: _FUNC_(string, string)

有人可以告诉我我想做的事是否可行,或者解释我要去哪里错了?

预先感谢

解决方案

select col1, get_json_object(col2,'$.variable1') as variable1,
get_json_object(col2,'$.variable2') as variable2,
get_json_object(col2,'$.variable3') as variable3 
from json_test

如果将输出放入表中(例如json_test),则可以通过这种方式进行解析.您也可以调整查询以获得这些结果.

输出:

col1 |variable1 |variable2 |variable3 |
-----|----------|----------|----------|
A    |123       |456       |789       |
B    |222       |333       |444       |

I am running a simple query in Hive that produces the following output (with a few other additional columns.

|------|-----------------------------------------------------------|
| col1 | col2                                                      |
|------|-----------------------------------------------------------|
|   A  | {"variable1":123,"variable2":456,"variable3":789}         |                                          
|------|-----------------------------------------------------------|
|   B  | {"variable1":222,"variable2":333,"variable3":444}         |
--------------------------------------------------------------------

I need to be able to parse the json string and pull out the values for each token during the SELECT statement itself so that I can perhaps incorporate a WHERE statement to return only the parts of the string that are valuable to me.

So my ultimate output might look like this:

|------------------------------------------|
| col1 |variable1 | variable2 | variable3  |                                      
|------------------------------------------|
|  A   |   123    |    456    |    789     |                                    
|------------------------------------------|
|  B   |   222    |    333    |    444     |
--------------------------------------------

I have tried using various functions in including SPLIT and GET_JSON_OBJECT using the argument structure specified in the esnaples yet all return errors such as:

No matching method for class org.apache.hadoop.hive.ql.udf.UDFJson 
with (struct<...>, string). Possible choices: _FUNC_(string, string)

Could someone please tell if what I am trying to do is feasible, or explain where I am going wrong?

Thanks in advance

解决方案

select col1, get_json_object(col2,'$.variable1') as variable1,
get_json_object(col2,'$.variable2') as variable2,
get_json_object(col2,'$.variable3') as variable3 
from json_test

If you put your output into a table (say json_test), you can parse in this way. You can tweak your query too to obtain these results.

Output:

col1 |variable1 |variable2 |variable3 |
-----|----------|----------|----------|
A    |123       |456       |789       |
B    |222       |333       |444       |

这篇关于如何从Hive中的json字符串中提取选定的值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆