How to read a Hadoop sequence file?


Problem description

I have a sequence file that is the output of a Hadoop MapReduce job. The data in this file is written as key-value pairs, and each value is itself a map. I want to read the value as a Map object so that I can process it further.

    Configuration config = new Configuration();
    Path path = new Path("D:\\OSP\\sample_data\\data\\part-00000");
    SequenceFile.Reader reader = new SequenceFile.Reader(FileSystem.get(config), path, config);
    WritableComparable key = (WritableComparable) reader.getKeyClass().newInstance();
    Writable value = (Writable) reader.getValueClass().newInstance();
    long position = reader.getPosition();

    while(reader.next(key,value))
    {
           System.out.println("Key is: "+textKey +" value is: "+val+"\n");
    }

Output of the program: Key is: [this is key] value is: {abc=839177, xyz=548498, lmn=2, pqr=1}

Here I am getting the value as a string, but I want it as a Map object.

Solution

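Check the API documentation for SequenceFile.Reader#next(Writable, Writable). The call fills the key and value objects passed to it, so the loop should print those objects rather than the undeclared textKey and val.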

    while(reader.next(key,value))
    {
           System.out.println("Key is: "+textKey +" value is: "+val+"\n");
    }

should be replaced with

    while(reader.next(key,value))
    {
           System.out.println("Key is: "+key +" value is: "+value+"\n");
    }

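Use SequenceFile.Reader#getValueClassName to find out the value type stored in the SequenceFile; a SequenceFile records its key and value class names in the file header. If the reported class is org.apache.hadoop.io.MapWritable, the value can be cast to MapWritable, which implements java.util.Map<Writable, Writable>, and used as a map directly.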
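If getValueClassName instead reports a text type such as org.apache.hadoop.io.Text, the map exists only in its printed form and has to be parsed back. The following is a minimal sketch of that fallback, not part of the original answer. It assumes the value text has the same shape as the program output in the question ({name=number, ...}), that names contain no commas or equals signs, and that every number fits in a long. The class MapStringParser and its parse method are hypothetical names introduced for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns text such as "{abc=839177, xyz=548498}" into a Map.
final class MapStringParser {

    // Assumes entries are separated by commas, each entry is name=number,
    // and names contain neither ',' nor '='.
    static Map<String, Long> parse(String text) {
        String body = text.trim();
        // Drop the surrounding braces if they are present.
        if (body.startsWith("{") && body.endsWith("}")) {
            body = body.substring(1, body.length() - 1);
        }
        // LinkedHashMap keeps the entries in the order they appear in the text.
        Map<String, Long> result = new LinkedHashMap<>();
        if (body.trim().isEmpty()) {
            return result;
        }
        for (String entry : body.split(",")) {
            String[] parts = entry.split("=", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed entry: " + entry);
            }
            result.put(parts[0].trim(), Long.parseLong(parts[1].trim()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample text copied from the program output shown in the question.
        Map<String, Long> map = parse("{abc=839177, xyz=548498, lmn=2, pqr=1}");
        System.out.println(map.get("abc"));
        System.out.println(map.size());
    }
}
```

Inside the read loop this would be called as MapStringParser.parse(value.toString()). When the value class is MapWritable, the cast is the simpler route and no parsing is needed.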
