Using FileFormat v Serde to read custom text files
Hadoop/Hive newbie here. I am trying to use data stored in a custom text-based format with Hive. My understanding is that you can write either a custom FileFormat or a custom SerDe class to do that. Is that the case, or am I misunderstanding it? And what are some general guidelines on which option to choose when? Thanks!
I figured it out. I did not have to write a SerDe after all; instead I wrote a custom InputFormat (extending org.apache.hadoop.mapred.TextInputFormat) that returns a custom RecordReader (implementing org.apache.hadoop.mapred.RecordReader<K, V>). The RecordReader implements the logic to read and parse my files and returns tab-delimited rows.
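As an illustration of that parsing step, here is a minimal sketch of the kind of conversion a custom RecordReader's next() might perform. The input format here (semicolon-separated key=value pairs) and all class and field names are assumptions for the example, not the original files' actual format:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the parsing logic a custom RecordReader could apply per record.
// The "key=value;key=value" layout is a made-up example format; the real
// logic would parse whatever the custom files actually contain.
public class CustomRecordParser {

    // Convert one raw record into the tab-delimited row that Hive's
    // default delimited SerDe expects (matching FIELDS TERMINATED BY '\t').
    public static String toTabDelimited(String rawRecord) {
        List<String> values = new ArrayList<>();
        for (String pair : rawRecord.split(";")) {
            int eq = pair.indexOf('=');
            // Keep only the value part; a pair without '=' becomes an empty field.
            values.add(eq >= 0 ? pair.substring(eq + 1) : "");
        }
        return String.join("\t", values);
    }

    public static void main(String[] args) {
        // Example: two fields become one tab-separated row.
        System.out.println(toTabDelimited("field1=abc;field2=1.5"));
    }
}
```

In the real RecordReader, this conversion would happen inside next(), writing the tab-delimited result into the Text value that Hive then splits on '\t'.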
With that, I declared my table as

create table t2 (
  field1 string,
  ..
  fieldNN float)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT 'namespace.CustomFileInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
This uses Hive's native SerDe. Also, an output format must be specified whenever a custom input format is used, so I chose one of the built-in output formats.