How to read a CSV file from HDFS?


Question

I have my data in a CSV file that is stored in HDFS, and I want to read that file.

Can anyone help me with the code?

I'm new to Hadoop. Thanks in advance.

Recommended answer

The classes required for this are FileSystem, FSDataInputStream, and Path. A client could look something like this (place the method in a class of your own):

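import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    conf.addResource(new Path("/hadoop/projects/hadoop-1.0.4/conf/core-site.xml"));
    conf.addResource(new Path("/hadoop/projects/hadoop-1.0.4/conf/hdfs-site.xml"));
    FileSystem fs = FileSystem.get(conf);
    FSDataInputStream inputStream = fs.open(new Path("/path/to/input/file"));
    // readChar() would decode two raw bytes as one UTF-16 char; read text lines instead.
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, "UTF-8"))) {
        String line;
        while ((line = reader.readLine()) != null) { System.out.println(line); }
    }
}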

FSDataInputStream has several read methods. Choose the one that suits your needs.
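
For line-oriented text such as CSV, one option is to treat the stream as an ordinary java.io.InputStream, which works because FSDataInputStream extends java.io.DataInputStream. The sketch below uses only the JDK, so it can be tried without a cluster. The class name CsvLineReader and the method readRows are hypothetical helpers, not part of Hadoop, and the code assumes UTF-8 text with no quoted fields and no newlines inside a field.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a Hadoop class.
class CsvLineReader {

    // Reads every non-empty line from the stream and splits it on commas.
    // Assumption: UTF-8 text, no quoted fields, no embedded newlines.
    static List<String[]> readRows(InputStream in) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                rows.add(line.split(",", -1)); // -1 keeps trailing empty fields
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by fs.open(...): any InputStream works.
        byte[] sample = "id,name\n1,alice\n2,\n".getBytes(StandardCharsets.UTF_8);
        for (String[] row : readRows(new ByteArrayInputStream(sample))) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

In a real client, the stream passed to readRows would be the one returned by fs.open(new Path(...)); the try-with-resources block closes it once the last line has been read.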

If you are using MapReduce, it is even easier:

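import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Nest this inside your job class. The output types (Text, Text) are
// placeholders; substitute whatever key and value types your job needs.
public static class YourMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // The framework does the reading for you: with the default
        // TextInputFormat, value holds one line of your CSV file.
        String line = value.toString();
        // Do your processing here. As an illustration only, emit the first
        // column as the key and the whole line as the value.
        String firstColumn = line.split(",", -1)[0];
        context.write(new Text(firstColumn), value);
    }
}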
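
Inside map, the line still has to be split into fields, and a plain split on commas breaks as soon as a field contains a quoted comma. The following is a minimal sketch of a quote-aware splitter that could be called on value.toString(). The class CsvFieldSplitter and its split method are hypothetical names introduced here, and the code assumes double-quote escaping in the common CSV style and records that fit on a single line (the default TextInputFormat hands the mapper one physical line at a time).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a Hadoop class.
class CsvFieldSplitter {

    // Splits one CSV line into fields. Commas inside double quotes are kept,
    // and a doubled quote ("") inside a quoted field becomes a single quote.
    // Assumption: the record does not span multiple lines.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;                 // skip the second quote
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);        // start the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field, possibly empty
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("1,\"Doe, Jane\",\"said \"\"hi\"\"\","));
    }
}
```

For files with multi-line records or other dialect quirks, a dedicated CSV parsing library is likely a safer choice than a hand-written splitter like this one.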
