How to load data into the same Hive table if files have different numbers of columns
Question
I have a main table (Employee) with 10 columns, and I can load data into it using LOAD DATA INPATH '/file1.txt' INTO TABLE Employee
My question is how to handle the same table (Employee) when file2.txt has the same columns except that columns 3 and 5 are missing. If I load the data directly, the last two columns will be NULL, but columns 3 and 5 are the ones that should be loaded as NULL.
Suppose I have a table Employee and I want to load file1.txt and file2.txt into it.
file1.txt
==========
id name sal deptid state country
1 aaa 1000 01 TS india
2 bbb 2000 02 AP india
3 ccc 3000 03 BGL india
file2.txt
id name deptid country
1 second 001 US
2 third 002 ENG
3 forth 003 AUS
In file2.txt, two columns are missing: sal and state.
We need to use the same Employee table. How should we handle this?
Recommended answer
There seems to be no way to load a file directly into specified columns.
As such, this is probably what you need to do:
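- Use load data inpath to load the file into a (temporary?) table whose layout matches the file.
- Insert into the relevant columns of the final table by selecting the contents of that table.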
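The two steps above could look roughly like the following HiveQL. This is a minimal sketch, not a definitive implementation. It assumes Employee is declared with the six columns shown in the sample (id, name, sal, deptid, state, country) in that order, that fields are separated by a single space, and that the file sits at /file2.txt on HDFS. The staging table name employee_stage_file2 and the column types are hypothetical.

```sql
-- Staging table whose layout matches file2.txt (id, name, deptid, country).
-- Assumption: fields are separated by a single space; adjust to the real delimiter.
CREATE TABLE IF NOT EXISTS employee_stage_file2 (
  id      INT,
  name    STRING,
  deptid  STRING,   -- kept as STRING so values such as 001 keep their leading zeros
  country STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ' '
STORED AS TEXTFILE;

-- Step 1: move the file from HDFS into the staging table.
LOAD DATA INPATH '/file2.txt' INTO TABLE employee_stage_file2;

-- Step 2: copy into the final table, supplying NULL for the missing columns.
-- Assumption: Employee is (id, name, sal, deptid, state, country), in that order.
INSERT INTO TABLE Employee
SELECT
  id,
  name,
  CAST(NULL AS INT)    AS sal,
  deptid,
  CAST(NULL AS STRING) AS state,
  country
FROM employee_stage_file2;
```

If the file really contains a header line, as in the sample, the staging table would probably also need TBLPROPERTIES ("skip.header.line.count"="1") so that the header is not loaded as a data row. Note that LOAD DATA INPATH moves the file into the staging table's directory, so dropping that table afterwards also removes the staged copy of the data.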
The situation is very similar to this question, which covers the opposite scenario (you only want to load a few columns).