How to read a CSV file from HDFS?
Question
I have my data in a CSV file that is stored in HDFS, and I want to read it.
Can anyone help me with the code?
I'm new to Hadoop. Thanks in advance.
Recommended answer
The classes required for this are FileSystem, FSDataInputStream and Path. A client would look something like this:
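import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class HdfsReadExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Load the cluster settings so that FileSystem.get() resolves to HDFS.
        conf.addResource(new Path("/hadoop/projects/hadoop-1.0.4/conf/core-site.xml"));
        conf.addResource(new Path("/hadoop/projects/hadoop-1.0.4/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);
        try (FSDataInputStream inputStream = fs.open(new Path("/path/to/input/file"))) {
            // readChar() reads two bytes as a single char; it only shows that the stream works.
            System.out.println(inputStream.readChar());
        }
    }
}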
FSDataInputStream has several read methods. Choose the one that suits your needs.
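For a CSV file, reading line by line is usually more convenient than readChar(). FSDataInputStream is an ordinary java.io.InputStream, so it can be wrapped in a BufferedReader. The following is only a minimal sketch under stated assumptions: the class HdfsCsvLines and the method readRows are hypothetical names, UTF-8 is an assumed encoding, and the naive split on commas does not handle quoted fields. The demo feeds it an in-memory stream so that it runs without a cluster; against HDFS you would pass the stream returned by fs.open(...) instead.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

public class HdfsCsvLines {

    // Reads every line from the stream and splits it on commas.
    // Hypothetical helper: the caller owns the stream and must close it.
    public static List<String[]> readRows(InputStream in) {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        return reader.lines()
                .map(line -> line.split(",", -1)) // -1 keeps trailing empty fields
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // In-memory stand-in for fs.open(new Path("/path/to/input/file")).
        InputStream in = new ByteArrayInputStream(
                "id,name\n1,alice\n2,bob\n".getBytes(StandardCharsets.UTF_8));
        for (String[] row : readRows(in)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Leaving the stream open inside readRows is a deliberate choice: the code that opened the HDFS file (for example in a try-with-resources block) stays responsible for closing it.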
If you are using MapReduce, it is even easier:
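import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Your_Wish is a placeholder: substitute the output key and value types you need.
public static class YourMapper extends
        Mapper<LongWritable, Text, Your_Wish, Your_Wish> {
    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // The framework does the reading for you.
        String line = value.toString(); // line contains one line of your CSV file
        // Do your processing here.
        // ...
        context.write(Your_Wish, Your_Wish); // emit your output key and value
    }
}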
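Inside map(), the "do your processing here" step usually starts by splitting the line into fields. A plain split(",") breaks on commas inside quoted values, so here is a small quote-aware sketch in plain Java. CsvLineParser and parseLine are hypothetical names and not part of Hadoop. The parser assumes double-quote quoting with "" as the escape sequence, and it does not support fields that span several lines. In the mapper you would call it as parseLine(value.toString()).

```java
import java.util.ArrayList;
import java.util.List;

public class CsvLineParser {

    // Splits one CSV line into fields. Commas inside double quotes are kept,
    // and a doubled quote ("") inside a quoted field becomes a literal quote.
    public static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;                 // skip the second quote
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString()); // field boundary
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field (possibly empty)
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical sample line with a comma inside a quoted field.
        for (String field : parseLine("1,\"Doe, Jane\",42")) {
            System.out.println(field);
        }
    }
}
```

For production data, a dedicated CSV library is likely a safer choice than a hand-written parser.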