Hive query returning NULL values after importing data from a locally stored file


Problem description

I am new to Hive, so please be gentle if this is a beginner's question :-)

I use the following hive statement to create and load data into a table.

CREATE TABLE entities_extract (doc_id STRING, name STRING, type STRING, len STRING, offset STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/research/45924/hive/entities_extract';

LOAD DATA LOCAL INPATH '/home/researcher/hadoop-runnables/files/entitie_extract_by_doc.txt' OVERWRITE INTO TABLE entities_extract;

OK, so far so good: there are no errors when I execute this script. The weird thing is that when I run SELECT * on the table, my result shows 4 extra columns with NULL values.

The data that goes in looks like below:

USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Chanko   PERSON   6   41086

The data returned from the SELECT looks like this:

USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Chanko   PERSON   6   41086   NULL    NULL    NULL    NULL

EDIT: Below is a small subset of "entitie_extract_by_doc.txt":

USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Berkowitz   PERSON   9   385
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Marotolli   PERSON   939420
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Corzatt   PERSON   7   39772
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Berkowitz   PERSON   9  40314
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Corzatt   PERSON   7   40584
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Berkowitz   PERSON   9  40840
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Rich   PERSON   4   41038
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Lea   PERSON   3   41044
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Anthony   PERSON   7   41049
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Bill   PERSON   4   41062
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Nelson   PERSON   6   41067
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Barbara   PERSON   7   41078
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4   Chanko   PERSON   6   41086

I have already looked at my source data to see whether there were 4 extra tabs, but that was not the case.

Does anyone here have any idea where these 4 extra columns come from?

Kind regards,

Martijn

Solution

You are not creating an external table here, so there is no need to specify a location. Remove the LOCATION clause from the CREATE TABLE statement, and you should then get the correct values.
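
A minimal sketch of what this suggestion could look like, reusing the table definition and file path from the question. It assumes the table is meant to be an ordinary managed table and that any previously created entities_extract table has been dropped first. The DESCRIBE FORMATTED and SELECT statements at the end are additions for checking the result and are not part of the original answer.

```sql
-- Managed (non-external) table: with no LOCATION clause, Hive keeps the data
-- under its default warehouse directory instead of a user-supplied path.
CREATE TABLE entities_extract (
  doc_id STRING,
  name   STRING,
  type   STRING,
  len    STRING,
  offset STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

-- Same load statement as in the question.
LOAD DATA LOCAL INPATH '/home/researcher/hadoop-runnables/files/entitie_extract_by_doc.txt'
OVERWRITE INTO TABLE entities_extract;

-- Optional checks (added for illustration): show where Hive stored the table
-- and confirm that each row now has exactly the five declared columns.
DESCRIBE FORMATTED entities_extract;
SELECT * FROM entities_extract LIMIT 5;
```

If the data really does have to live at a fixed path such as /research/45924/hive/entities_extract, the usual alternative is CREATE EXTERNAL TABLE with a LOCATION clause, but that is a different setup from the one this answer describes.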
