Loading LinkedIn JSON response into Hive

Problem description

I have tried multiple ways to create the Hive table and retrieve data using JSONSerDe, but these are the errors I encounter:

hive> select * from jobs;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException: No content to map to Object due to end of input

hive> select values from jobs;

Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)

Here is the create table statement:

create external table jobs (
  jobs STRUCT<
    values : ARRAY<STRUCT<
      id : STRING,
      customerJobCode : STRING,
      postingDate : STRING,
      expirationDate : STRING,
      company : STRUCT<
        id : STRING,
        name : STRING>,
      position : STRUCT<
        title : STRING,
        jobFunctions : STRING,
        industries : STRING,
        jobType : STRING,
        experienceLevel : STRING>,
      skillsAndExperience : ARRAY<STRING>,
      descriptionSnippet : ARRAY<STRING>,
      salary : STRING,
      jobPoster : STRUCT<
        id : STRING,
        firstName : STRING,
        lastName : STRING,
        headline : STRING>,
      referralBonus : STRING,
      locationDescription : STRING>>>
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/sunita/tables/jobs';

The raw input file is: https://gist.github.com/anonymous/e2c15d808bbe46b707bf/raw/88d775cb418901807980c52e803ffc8be53adc5f/jobsearch.json

I tried leaving 'values' (an array of structs) out of the table description. I also tried removing 'values' from both the input file and the create table statement. That approach produces no errors, but, as one might anticipate, only one entry gets into the table and everything else comes through as null. Hive treats the whole input as a single record, which causes this issue.
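Since Hive reads a text-backed table line by line, one way to avoid the "single record" behaviour is to write each job as its own JSON line before loading. The following is a minimal sketch, not the poster's method. It assumes the response has the layout `{"jobs": {"values": [...]}}` implied by the create table statement, and the function names `explode_jobs` and `to_json_lines` are hypothetical.

```python
import json


def explode_jobs(document_text):
    """Return one compact JSON string per element of jobs.values."""
    document = json.loads(document_text)
    # Assumed layout (inferred from the table definition, not verified
    # against the real LinkedIn response): {"jobs": {"values": [...]}}
    records = document["jobs"]["values"]
    return [json.dumps(record, separators=(",", ":")) for record in records]


def to_json_lines(document_text):
    """Join the records with newlines and leave no trailing blank line."""
    return "\n".join(explode_jobs(document_text))


if __name__ == "__main__":
    sample = '{"jobs": {"values": [{"id": "1"}, {"id": "2"}]}}'
    print(to_json_lines(sample))
```

With input in this shape, the table columns would presumably be the job fields themselves (id, company, position, and so on) rather than the wrapping jobs struct, so the create table statement would need to change accordingly.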

I tried simplifying the input to select fewer fields, but I still get the same error when retrieving the information. Any help in this regard is truly appreciated.

I also verified that the JSON string is valid using the Notepad++ JSON plugin. Any help is truly appreciated.
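A whole-file validator can pass even when individual lines are not usable on their own, which matters because the SerDe parses one line at a time. The sketch below checks each line separately. It is an illustration under that assumption, and `find_bad_lines` is a hypothetical helper name. The check is deliberately strict: a newline at the very end of the text is reported as an empty final line.

```python
import json


def find_bad_lines(text):
    """Return (line_number, reason) for every line that is not standalone JSON."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            # An empty line gives a JSON parser nothing to map to an object.
            problems.append((number, "empty line"))
            continue
        try:
            json.loads(line)
        except ValueError as error:
            problems.append((number, str(error)))
    return problems


if __name__ == "__main__":
    # A valid document followed by a trailing newline: the second,
    # empty "line" is what gets flagged.
    print(find_bad_lines('{"jobs": {"values": []}}\n'))
```

Pretty-printed JSON that spans several lines would also be flagged here, line by line, even though it is valid as a whole document.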

Recommended answer

The problem was a newline at the end of the input file. Making sure that I eliminated any characters after the end of the data resolved the issue.
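The cleanup described above can be scripted rather than done by hand in an editor. This is a minimal sketch that works on a local copy of the file. The helper names `strip_trailing_whitespace` and `clean_file` are hypothetical, and treating spaces, tabs, carriage returns and newlines as the characters to drop is an assumption about what "any characters" covered.

```python
def strip_trailing_whitespace(data: bytes) -> bytes:
    """Drop spaces, tabs, carriage returns and newlines after the last data byte."""
    return data.rstrip(b" \t\r\n")


def clean_file(path):
    """Rewrite the file without trailing whitespace; return the bytes removed."""
    with open(path, "rb") as handle:
        original = handle.read()
    cleaned = strip_trailing_whitespace(original)
    if cleaned != original:
        # Only touch the file when something actually changes.
        with open(path, "wb") as handle:
            handle.write(cleaned)
    return len(original) - len(cleaned)
```

Calling `clean_file` on a local copy of jobsearch.json would return how many trailing bytes were removed. The cleaned file would then need to be copied back to the table location, which is /user/sunita/tables/jobs in the question.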
