HIVE query returning NULL values after importing data from a locally stored file
I am new to Hive, so please be gentle if this is a beginner's question :-)
I use the following Hive statements to create a table and load data into it.
CREATE TABLE entities_extract (doc_id STRING, name STRING, type STRING, len STRING, offset STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/research/45924/hive/entities_extract';
LOAD DATA LOCAL INPATH '/home/researcher/hadoop-runnables/files/entitie_extract_by_doc.txt' OVERWRITE INTO TABLE entities_extract;
OK, so far so good: there are no errors when I execute this script. The weird thing is that when I run SELECT * on the table, my result shows 4 extra columns with NULL values.
The data that goes in looks like below:
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Chanko PERSON 6 41086
The data returned by the SELECT looks like this:
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Chanko PERSON 6 41086 NULL NULL NULL NULL
EDIT: Below is a small subset of "entitie_extract_by_doc.txt":
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Berkowitz PERSON 9 385
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Marotolli PERSON 939420
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Corzatt PERSON 7 39772
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Berkowitz PERSON 9 40314
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Corzatt PERSON 7 40584
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Berkowitz PERSON 9 40840
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Rich PERSON 4 41038
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Lea PERSON 3 41044
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Anthony PERSON 7 41049
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Bill PERSON 4 41062
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Nelson PERSON 6 41067
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Barbara PERSON 7 41078
USER.A-GovDocs-f83c6ca3-9585-4c66-b9b0-f4c3bd57ccf4 Chanko PERSON 6 41086
I have already looked at my source data to see if there were 4 extra tabs, but that was not the case.
Does anyone here have any idea where these 4 extra columns come from?
Kind regards,
Martijn
You are not creating an external table here, so there is no need to specify a LOCATION. Remove the LOCATION clause from the CREATE TABLE statement and you should get the correct values.
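As a rough sketch of that suggestion, the statements from the question without the LOCATION clause might look like the following. The table name, columns, delimiters, and file path are copied from the question. The DROP TABLE and the final SELECT are additions for illustration only, and they assume it is acceptable to drop and recreate the table.

```sql
-- Sketch only: the question's table definition with the LOCATION clause removed.
-- Assumption: the existing table may be dropped. For a managed table this also
-- deletes the data files stored under the table's directory.
DROP TABLE IF EXISTS entities_extract;

CREATE TABLE entities_extract (doc_id STRING, name STRING, type STRING, len STRING, offset STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

-- Reload the local file; OVERWRITE replaces any data already in the table.
LOAD DATA LOCAL INPATH '/home/researcher/hadoop-runnables/files/entitie_extract_by_doc.txt' OVERWRITE INTO TABLE entities_extract;

-- Inspect a few rows to check that each of the five columns is populated.
SELECT * FROM entities_extract LIMIT 5;
```

Without LOCATION, Hive stores the table's data under its default warehouse directory instead of '/research/45924/hive/entities_extract'.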