Sequence Files in Hadoop

Question

How are these sequence files generated? I saw a page about sequence files here:

http://wiki.apache.org/hadoop/SequenceFile

Are these written using the default Java serializer? And how do I read a sequence file?

Recommended answer

Sequence files are generated by MapReduce tasks and can be used as a common format for transferring data between MapReduce jobs.

You can read them as follows:

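import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

Configuration config = new Configuration();
Path path = new Path(PATH_TO_YOUR_FILE);
// try-with-resources closes the reader even if reading fails
try (SequenceFile.Reader reader = new SequenceFile.Reader(FileSystem.get(config), path, config)) {
  // The file header records the key and value classes; create one reusable instance of each
  Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), config);
  Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), config);
  while (reader.next(key, value)) {
    // process the current key and value here
  }
}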

You can also generate sequence files yourself using SequenceFile.Writer.
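A minimal sketch of writing a sequence file and reading it back might look like the following. It assumes Hadoop 2 or later on the classpath and uses the option-based SequenceFile.createWriter and SequenceFile.Reader constructors. The class name SequenceFileWriteDemo, the default path demo.seq, and the choice of Text keys with IntWritable values are made up for illustration. Both types are Hadoop Writable implementations, so each record is serialized by the key and value objects themselves.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Hypothetical demo class, not part of Hadoop.
public class SequenceFileWriteDemo {

  // Writes one (Text, IntWritable) record per word: the word and its length.
  static void write(Configuration conf, Path path, String[] words) throws IOException {
    try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(path),
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(IntWritable.class))) {
      Text key = new Text();
      IntWritable value = new IntWritable();
      for (String word : words) {
        key.set(word);
        value.set(word.length());
        writer.append(key, value);
      }
    }
  }

  // Reads every record back and returns them as "key=value" strings.
  static List<String> read(Configuration conf, Path path) throws IOException {
    List<String> records = new ArrayList<>();
    try (SequenceFile.Reader reader =
        new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
      Text key = new Text();
      IntWritable value = new IntWritable();
      while (reader.next(key, value)) {
        records.add(key + "=" + value);
      }
    }
    return records;
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // "demo.seq" is an assumed default; pass a real path as the first argument.
    Path path = new Path(args.length > 0 ? args[0] : "demo.seq");
    write(conf, path, new String[] {"hadoop", "sequence", "file"});
    read(conf, path).forEach(System.out::println);
  }
}
```

With a default Configuration the path resolves against the local file system, so the sketch can be tried without a running cluster. Reusing a single key and value object across calls to append and next follows the same pattern as the reading code above.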
