Read files recursively from subdirectories with Spark, from S3 or the local filesystem

Views: 259 Published: 2016/5/22 15:35:29 Tags: scala hadoop apache-spark


Problem description

I am trying to read files from a directory that contains many subdirectories. The data is in S3, and I am trying to do this:

val rdd = sc.newAPIHadoopFile(data_loc,
    classOf[org.apache.hadoop.mapreduce.lib.input.TextInputFormat],
    classOf[org.apache.hadoop.mapreduce.lib.input.TextInputFormat],
    classOf[org.apache.hadoop.io.NullWritable])

This does not seem to work.

Appreciate the help
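One possible direction, offered as a minimal sketch and not as a confirmed answer: the call above passes TextInputFormat as the key class and NullWritable as the value class, whereas TextInputFormat produces LongWritable keys (byte offsets) and Text values (lines). In addition, Hadoop's FileInputFormat only descends into subdirectories when the property mapreduce.input.fileinputformat.input.dir.recursive is set to true. The path, the application name, and the local master fallback below are assumptions made for illustration only.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object RecursiveRead {
  def main(args: Array[String]): Unit = {
    // Hypothetical location; pass a real S3 or local directory as the first argument.
    val dataLoc = args.headOption.getOrElse("file:///tmp/data")

    // Assumed settings: the app name is arbitrary, and a local master is used
    // only when none was supplied (for example by spark-submit).
    val conf = new SparkConf()
      .setAppName("recursive-read")
      .setIfMissing("spark.master", "local[*]")
    val sc = new SparkContext(conf)

    // Ask Hadoop's FileInputFormat to walk into nested subdirectories.
    sc.hadoopConfiguration.set(
      "mapreduce.input.fileinputformat.input.dir.recursive", "true")

    // Key and value classes must match what TextInputFormat emits:
    // LongWritable (byte offset of the line) and Text (the line itself).
    val rdd = sc.newAPIHadoopFile(
      dataLoc,
      classOf[TextInputFormat],
      classOf[LongWritable],
      classOf[Text])

    // Hadoop reuses Writable instances, so convert each line to a String
    // before doing anything else with it.
    val lines = rdd.map { case (_, text) => text.toString }

    println(lines.count())
    sc.stop()
  }
}
```

If the directory depth is fixed and known, a glob such as a path ending in /*/* passed to sc.textFile may be a simpler alternative, since Hadoop path patterns are expanded before the files are read.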
