Loading LinkedIn JSON response into Hive
Problem description
I have tried several ways to create the Hive table and retrieve data using JSONSerDe, but I run into the following errors:
hive> select * from jobs;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException: No content to map to Object due to end of input
hive> select values from jobs;
Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
Here is the table creation statement:
create external table jobs (
  jobs STRUCT<
    values : ARRAY<STRUCT<
      id : STRING,
      customerJobCode : STRING,
      postingDate : STRING,
      expirationDate : STRING,
      company : STRUCT<
        id : STRING,
        name : STRING>,
      position : STRUCT<
        title : STRING,
        jobFunctions : STRING,
        industries : STRING,
        jobType : STRING,
        experienceLevel : STRING>,
      skillsAndExperience : ARRAY<STRING>,
      descriptionSnippet : ARRAY<STRING>,
      salary : STRING,
      jobPoster : STRUCT<
        id : STRING,
        firstName : STRING,
        lastName : STRING,
        headline : STRING>,
      referralBonus : STRING,
      locationDescription : STRING>>>
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/sunita/tables/jobs';
The raw input file is at https://gist.github.com/anonymous/e2c15d808bbe46b707bf/raw/88d775cb418901807980c52e803ffc8be53adc5f/jobsearch.json
I tried leaving 'values' (an array of structs) out of the table description, and also tried removing 'values' from both the input file and the table creation statement. This approach produces no errors but, as one would expect, only one entry makes it into the table and everything else comes back as null. Hive treats the whole input as a single record, which causes this.
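For context on the single-record behavior: a text-backed Hive table hands the SerDe one line at a time, so one common workaround is to write each job as its own JSON object on its own line. The following is a minimal Python sketch of that idea, not something taken from the original post. It assumes the input has the jobs -> values layout implied by the table definition above. The function name to_json_lines is hypothetical, and the table would then need its columns declared at the top level rather than wrapped in jobs STRUCT<values : ARRAY<...>>.

```python
import json


def to_json_lines(document_text):
    """Return one compact JSON object per line, one line per entry in jobs.values.

    Assumption: the document looks like {"jobs": {"values": [ {...}, {...} ]}},
    which is the layout implied by the table definition, not verified
    against the raw file.
    """
    doc = json.loads(document_text)
    records = doc["jobs"]["values"]
    # json.dumps without indent never emits a raw newline, so each record
    # stays on a single physical line. No trailing newline is appended.
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records)
```

Each returned line would then map to one row, instead of the whole response mapping to a single row.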
I also tried simplifying the input to select fewer fields, but I still get the same error when retrieving the data. I also confirmed that the JSON string is valid using the Notepad++ JSON plugin. Any help is truly appreciated.
Recommended answer
The problem was a newline at the end of the input file. Making sure there were no characters after the end of the data resolved the issue.
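This fits the EOFException in the question: "No content to map to Object due to end of input" is the kind of message a JSON parser gives when handed an empty string, which is what a blank trailing line would look like to the SerDe. One way to remove trailing characters before loading the file is sketched below in Python. This is an illustration under assumptions, not the exact steps the answer's author used, and the function name strip_trailing_whitespace is hypothetical.

```python
import json
from pathlib import Path


def strip_trailing_whitespace(path):
    """Rewrite the file at `path` without trailing whitespace or newlines.

    Returns the number of bytes removed. Raises ValueError if what remains
    is not valid JSON, in which case the file is left unchanged.
    """
    p = Path(path)
    data = p.read_bytes()
    cleaned = data.rstrip()  # drops trailing spaces, tabs, \r and \n
    json.loads(cleaned)      # sanity check before overwriting the file
    p.write_bytes(cleaned)
    return len(data) - len(cleaned)
```

After cleaning a local copy such as jobsearch.json, the file would need to be copied back to the table's LOCATION (/user/sunita/tables/jobs in the question) before querying again.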