Pig : json loader 的结果为空 [英] Pig : result of json loader empty
本文介绍了Pig : json loader 的结果为空的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我正在使用 cdh5 quickstart vm 并且我有一个这样的文件(此处未满):
I'm using cdh5 quickstart vm and I have a file like this(not full here):
{"user_id": "kim95",
"type": "Book",
"title": "Modern Database Systems: The Object Model, Interoperability, and
Beyond.",
"year": "1995",
"publisher": "ACM Press and Addison-Wesley",
"authors": {},
"source": "DBLP"
}
{"user_id": "marshallo79",
"type": "Book",
"title": "Inequalities: Theory of Majorization and Its Application.",
"year": "1979",
"publisher": "Academic Press",
"authors": {("Albert W. Marshall"), ("Ingram Olkin")},
"source": "DBLP"
}
<小时>
我使用了这个脚本:
and I used this script:
books = load 'data/book-seded.json'
using JsonLoader('t1:tuple(user_id:
chararray,type:chararray,title:chararray,year:chararray,publisher:chararray,source:chararray,authors:bag{T:tuple(author:chararray)})');
STORE books INTO 'book-no-seded.tsv';
脚本可以运行,但是生成的文件是空的,你知道吗?
the script works , but the generated file is empty, do you have any idea?
推荐答案
最后,只有这个模式有效:如果我添加或删除与此配置不同的空格,那么我会出现错误(我还为元组添加了名称"并在为空时指定null",并更改作者和来源之间的顺序,但即使没有此配置,它仍然会出错)
Finally , only this schema worked : If I add or remove a space different from this configuration then i gonna have an error( i also added "name" for tuples and specified "null" when it was empty, and changed the order between authors and source, but even without this congiguration it will still be wrong)
{"user_id": "kim95", "type": "Book","title": "Modern Database Systems: The Object Model, Interoperability, and Beyond.", "year": "1995", "publisher": "ACM Press and Addison-Wesley", "authors": [{"name":null"}], "source": "DBLP"}
{"user_id": "marshallo79", "type": "Book", "title": "Inequalities: Theory of Majorization and Its Application.", "year": "1979", "publisher": "Academic Press", "authors": [{"name":"Albert W. Marshall"},{"name":"Ingram Olkin"}], "source": "DBLP"}
工作脚本是这样的:
books = load 'data/book-seded-workings-reduced.json'
using JsonLoader('user_id:chararray,type:chararray,title:chararray,year:chararray,publisher:chararray,authors:{(name:chararray)},source:chararray');
STORE books INTO 'book-table.csv'; //whether .tsv or .csv
这篇关于Pig : json loader 的结果为空的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文