Issue with Hive SerDe dealing with nested structs
Problem description
I am trying to load a large volume of JSON data with a nested structure into Hive using a JSON SerDe. Some of the field names in the nested structure start with $.
I am mapping the Hive field names using SerDe properties, but when I query the table I get null in the field starting with $.
I have tried different syntax, but no luck.
Sample JSON:
{
"_id" : "319FFE15FF90",
"SomeThing" :
{
"$SomeField" : 22,
"AnotherField" : 2112,
"YetAnotherField": 1
}
. . . etc . . . .
Using a schema as follows:
create table testSample
(
`_id` string,
something struct
<
$somefield:int,
anotherfield:bigint,
yetanotherfield:int
>
)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
with serdeproperties
(
"mapping.somefield" = "$somefield"
);
This schema builds OK. However, somefield (the field starting with $) in the above table always returns null (all the other values exist and are correct).
We've been trying a lot of syntax combinations, but to no avail.
Does anyone know the trick to map a nested field with a leading $ in its name?
You almost got it right. The mistake you're making is in how the column is named. When you map in the serde properties ("mapping.somefield" = "$somefield"), you're saying "when looking for the Hive column named 'somefield', look for the JSON field '$somefield'". But in Hive you defined the column with the dollar sign, which, if not outright illegal, is certainly not best practice in Hive. Try creating the table like this, with the struct field named somefield and no dollar sign:
create table testSample
(
`_id` string,
something struct
<
somefield:int,
anotherfield:bigint,
yetanotherfield:int
>
)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
with serdeproperties
(
"mapping.somefield" = "$somefield"
);
I tested it with some test data:
{ "_id" : "123", "something": { "$somefield": 12, "anotherfield":13,"yetanotherfield":100}}
hive> select something.somefield from testSample;
OK
12
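To make the mapping rule from the answer concrete, here is a minimal Python sketch of the lookup it describes: the property "mapping.somefield" = "$somefield" means that the Hive column somefield is read from the JSON key $somefield. This is only a conceptual model, not the JsonSerDe implementation; the names MAPPINGS and read_struct_field are hypothetical and exist only for illustration.

```python
import json

# Hypothetical model of "mapping.<hive_column>" = "<json_key>".
# Not the real JsonSerDe code; it only illustrates the lookup direction.
MAPPINGS = {"somefield": "$somefield"}


def read_struct_field(struct, hive_column):
    """Return the value for a Hive column, consulting the mapping first."""
    # If a mapping exists, read the mapped JSON key; otherwise use the
    # column name itself as the JSON key.
    json_key = MAPPINGS.get(hive_column, hive_column)
    # None stands in for Hive's NULL when the key is absent.
    return struct.get(json_key)


# Same test record as in the answer.
row = json.loads(
    '{ "_id" : "123", "something": { "$somefield": 12, '
    '"anotherfield":13,"yetanotherfield":100}}'
)

print(read_struct_field(row["something"], "somefield"))
print(read_struct_field(row["something"], "anotherfield"))
```

In this model the mapping is keyed by the Hive column name, so the column itself should be declared as somefield, which is what the corrected table definition does.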