Parse Complex JSON String in Pig

Problem description

I want to parse a string of complex JSON in Pig. Specifically, I want Pig to treat a JSON array as a bag rather than as a single chararray. With JsonLoader I can do this easily by specifying the schema, as described in a related question. Is there a way either to have Pig infer the schema for me, or to specify it when Pig parses a string? I've been using JsonStringToMap, but I can't find a way to give it a schema, or to make it recognize that my JSON array is an array and not a single chararray.
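
For context, the JsonLoader approach mentioned above might look roughly like the sketch below. The file name `input.json` and the fields `name` and `tags` are hypothetical, chosen only for illustration. It also assumes each array element is a JSON object, so that it maps onto a tuple in the declared bag.

```pig
-- Hypothetical input line: {"name":"a","tags":[{"tag":"x"},{"tag":"y"}]}
-- The schema string tells JsonLoader that `tags` is a bag of tuples,
-- not a single chararray.
records = LOAD 'input.json'
          USING JsonLoader('name:chararray, tags:{(tag:chararray)}');

-- Because `tags` is a real bag, it can be flattened into one row per element.
flat = FOREACH records GENERATE name, FLATTEN(tags) AS tag;
DUMP flat;
```

This only works when the JSON is the whole input record and the schema is known up front, which is why it does not directly cover JSON held in a chararray field.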

Recommended answer

I wound up using JsonTupleMap() from Mozilla's Akela library for Pig. It does exactly what I want: it parses all of my JSON, even when it is complex, and it does so without my providing a schema. If you run into the same problem, use that.

Example usage:

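-- Register the Akela jar and define a short alias for the UDF.
REGISTER '/path/to/akela-0.5-SNAPSHOT.jar';
DEFINE JsonTupleMap com.mozilla.pig.eval.json.JsonTupleMap();
-- Load each record with its raw JSON held in a chararray field.
loaded = LOAD '$INPUT' AS (json_string:chararray, ...);
-- Parse the JSON string into a Pig map; no schema is supplied.
jsonified = FOREACH loaded GENERATE JsonTupleMap(json_string) AS json:map[], ...;
-- Dereference nested keys with the # operator.
some_generate = FOREACH jsonified GENERATE json#'key'#'sub_key';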
