Hadoop Map Reduce read a text file


Question




I'm trying to write a MapReduce program that can read an input file and write the output to another text file. I'm planning to use the BufferedReader class for this. But I don't really know how to use it in a MapReduce program.

Can someone give me a code snippet of it?

P.S. I'm totally new to Hadoop and MapReduce programming. So please bear with me.

Thank you in advance.
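Before getting to the Hadoop-specific answer below: the `BufferedReader` usage the question asks about is just the standard Java read-loop. As a minimal local-file sketch (the file names `input.txt`/`output.txt` and the uppercase transform are illustrative, not part of the answer):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class CopyUpperCase {
    public static void main(String[] args) throws IOException {
        // read input.txt line by line and write a transformed copy to output.txt;
        // try-with-resources closes both streams even if an exception is thrown
        try (BufferedReader br = new BufferedReader(new FileReader("input.txt"));
             BufferedWriter bw = new BufferedWriter(new FileWriter("output.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(line.toUpperCase()); // example transformation
                bw.newLine();
            }
        }
    }
}
```

In a MapReduce job the same loop appears, only with the stream opened from HDFS via `FileSystem.open()` instead of `FileReader`, as the accepted answer shows.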

Solution

The code below reads a file from HDFS and prints its contents to the console:

import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Cat {
    public static void main(String[] args) throws Exception {
        Path pt = new Path("hdfs:/path/to/file"); // location of the file in HDFS
        FileSystem fs = FileSystem.get(new Configuration());
        // try-with-resources closes the reader even if an exception is thrown;
        // the original empty catch block silently swallowed all errors
        try (BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(pt)))) {
            String line = br.readLine();
            while (line != null) {
                System.out.println(line);
                line = br.readLine();
            }
        }
    }
}

EDIT

Driver

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class ReadFile {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Read a File");

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // delete the output directory if it already exists, otherwise
        // the job fails with a FileAlreadyExistsException
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(new Path(args[1])))
            fs.delete(new Path(args[1]), true);

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setJarByClass(ReadFile.class);
        job.waitForCompletion(true);
    }

}
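The driver registers a `Reduce` class that the answer never shows. A minimal placeholder matching the driver's `Text`/`IntWritable` output types (the per-key sum is an assumption in the word-count style, purely illustrative) could look like:

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // sum the counts emitted by the mapper for each key
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
```

If no aggregation is needed, the reducer can simply be omitted from the driver (or replaced with the identity `Reducer.class`).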

Mapper

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Map extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void setup(Context context) throws IOException {
        Path pt = new Path("hdfs:/path/to/file"); // location of the file in HDFS
        FileSystem fs = FileSystem.get(new Configuration());
        try (BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(pt)))) {
            String line = br.readLine();
            while (line != null) {
                System.out.println(line);
                line = br.readLine();
            }
        }
    }

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // process each input record as you wish
    }
}

The code above reads a text file from HDFS.
