Hive - create hive table from specific data of three csv files in hdfs


Question


I have three .csv files, each in a different HDFS directory. I now want to make a Hive internal table with data from those three files: four columns from the first file, three columns from the second file, and two columns from the third file. The first file shares a unique id column with the second file, and the second file shares another unique id column with the third file. Both unique ids are present in the second file; using these ids I would like to left-outer-join the files into one table.


file 1: '/directory_1/sub_directory_1/table1_data_on_01_01_2014.csv'
file 2: '/directory_2/sub_directory_2/table2_data_on_01_01_2014.csv'
file 3: '/directory_3/sub_directory_3/table3_data_on_01_01_2014.csv'

Contents of file 1:

unique_id_1,age,department,reason_of_visit,--more columns--,,,
id_11,entry_12,entry_13,entry_14,--more entries--
id_12,entry_22,entry_23,entry_24,--more entries--
id_13,entry_32,entry_33,entry_34,--more entries--

Contents of file 2:

unique_id_1,date_of_transaction,transaction_fee,unique_id_2,--more columns--,,,
id_11,entry_121,entry_131,id_21,--more entries--
id_12,entry_221,entry_231,id_22,--more entries--
id_13,entry_321,entry_331,id_23,--more entries--

Contents of file 3:

unique_id_2,diagnosis,gender,--more columns--,,,
id_21,entry_141,entry_151,--more entries--
id_22,entry_241,entry_151,--more entries--
id_23,entry_341,entry_151,--more entries--


I now want to make an internal table like this:

unique_id_1 age department reason_of_visit date_of_transaction transaction_fee unique_id_2 diagnosis gender
id_11 entry_12 entry_13 entry_14 entry_121 entry_131 id_21 entry_141 entry_151
id_12 entry_22 entry_23 entry_24 entry_221 entry_231 id_22 entry_241 entry_251
id_13 entry_32 entry_33 entry_34 entry_321 entry_331 id_23 entry_341 entry_251

How do I do this?

Answer


@Naveen Kumar The solution here is to create external tables for your 3 sources. Next, create a combined internal table with the schema for the columns you need from the 3 sources. I call these temp or staging tables. Once these staging tables are created, you should be able to do your joined select as an INSERT INTO combined_table SELECT ...
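A minimal HiveQL sketch of the approach above, assuming the files are comma-delimited with a header row. The column names come from the question; the staging/combined table names, the column types, and the `--more columns--` omissions are placeholders you would fill in for your actual data:

```sql
-- External (staging) tables pointing at the existing HDFS directories.
-- LOCATION takes the directory, not the file; skip.header.line.count
-- drops the header row.
CREATE EXTERNAL TABLE staging_table1 (
  unique_id_1 STRING,
  age INT,
  department STRING,
  reason_of_visit STRING
  -- more columns --
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/directory_1/sub_directory_1/'
TBLPROPERTIES ('skip.header.line.count'='1');

CREATE EXTERNAL TABLE staging_table2 (
  unique_id_1 STRING,
  date_of_transaction STRING,
  transaction_fee STRING,
  unique_id_2 STRING
  -- more columns --
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/directory_2/sub_directory_2/'
TBLPROPERTIES ('skip.header.line.count'='1');

CREATE EXTERNAL TABLE staging_table3 (
  unique_id_2 STRING,
  diagnosis STRING,
  gender STRING
  -- more columns --
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/directory_3/sub_directory_3/'
TBLPROPERTIES ('skip.header.line.count'='1');

-- Internal (managed) table holding only the columns you need.
CREATE TABLE combined_table (
  unique_id_1 STRING,
  age INT,
  department STRING,
  reason_of_visit STRING,
  date_of_transaction STRING,
  transaction_fee STRING,
  unique_id_2 STRING,
  diagnosis STRING,
  gender STRING
);

-- Populate it with left outer joins on the two id columns.
INSERT INTO TABLE combined_table
SELECT t1.unique_id_1, t1.age, t1.department, t1.reason_of_visit,
       t2.date_of_transaction, t2.transaction_fee, t2.unique_id_2,
       t3.diagnosis, t3.gender
FROM staging_table1 t1
LEFT OUTER JOIN staging_table2 t2 ON t1.unique_id_1 = t2.unique_id_1
LEFT OUTER JOIN staging_table3 t3 ON t2.unique_id_2 = t3.unique_id_2;
```

Because the external tables are only metadata over the existing CSV directories, dropping them afterwards does not delete the source files, while `combined_table` is fully managed by Hive.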

