How to load data into the same Hive table if a file has a different number of columns


Problem description

I have a main table (Employee) with 10 columns, and I can load data into it using LOAD DATA INPATH '/file1.txt' INTO TABLE Employee.

My question is how to handle the same table (Employee) if my file file2.txt has the same columns except that columns 3 and 5 are missing. If I load the data directly, the values shift left and the last two columns end up as NULL. Instead, the 3rd and 5th columns should be loaded as NULL.

Suppose I have a table Employee and I want to load both file1.txt and file2.txt into it.


file1.txt
==========
id name sal deptid state country  
1  aaa  1000 01   TS   india  
2  bbb  2000 02   AP   india  
3  ccc  3000 03   BGL   india  


file2.txt  

id  name   deptid country  
1  second   001   US  
2  third    002   ENG  
3  forth    003   AUS  

In file2.txt two columns are missing: sal and state.

We need to use the same Employee table for both files. How can this be handled?

Recommended answer

It seems there is no way to load a file directly into specified columns.

As such, this is what you probably need to do:

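  1. Load data inpath into a (temporary?) table whose layout matches the file.
  2. Insert into the relevant columns of the final table by selecting the contents of that table.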
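
The two steps above could look roughly like the following HiveQL. This is only a sketch under assumptions: the staging table name employee_stage is hypothetical, the column types and the tab delimiter are guesses, and Employee is assumed to have the six columns shown in the file1.txt sample (id, name, sal, deptid, state, country) in that order. Adjust all of these to the real schema.

```sql
-- Hypothetical staging table whose layout matches file2.txt (4 columns).
-- Column types and the field delimiter are assumptions.
CREATE TABLE IF NOT EXISTS employee_stage (
  id      INT,
  name    STRING,
  deptid  STRING,
  country STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';

-- Step 1: load the file as-is into the staging table.
LOAD DATA INPATH '/file2.txt' INTO TABLE employee_stage;

-- Step 2: copy the rows into Employee, supplying NULL for the
-- columns that file2.txt does not contain (sal and state).
-- The SELECT list must follow the column order of Employee.
INSERT INTO TABLE Employee
SELECT
  id,
  name,
  CAST(NULL AS INT)    AS sal,
  deptid,
  CAST(NULL AS STRING) AS state,
  country
FROM employee_stage;
```

Because INSERT ... SELECT matches columns by position, the explicit NULL placeholders are what keep deptid and country from shifting into the wrong columns. The staging table can be dropped once the insert has finished.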

The situation is very similar to this question, which covers the opposite scenario (you only want to load a few columns).
