Amazon Web Service EMR FileSystem

Views: 120 Posted: 2020/8/23 2:36:54 Tags: java hadoop amazon-web-services amazon-s3 elastic-map-reduce


Problem description

I am trying to run a job on an AWS EMR cluster. The problem I'm getting is the following:

aws java.io.IOException: No FileSystem for scheme: hdfs

I don't know exactly where the problem lies (in my Java job jar or in the job's configuration).

In my S3 bucket I made a folder (input) and put a bunch of files with my data in it. In the job arguments I give the path of the input folder, and that same path is then used as FileInputPath.getInputPath(args[0]).

First question: will the job pick up all files in the input folder and process them all, or do I have to supply the path of each file?

Second question: how can I solve the above exception?

Thanks

Recommended answer

Keep your input files in S3, e.g. s3://mybucket/input/. Put all the files to be processed in the input folder under your bucket.

In your MapReduce job, use code like the following:

FileInputFormat.addInputPath(job, new Path("s3n://mybucket/input/"));

This will automatically process all files under the input folder.
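
The exception in the question names the URI scheme (hdfs) that Hadoop could not map to a FileSystem implementation, so it can help to check which scheme each path argument actually carries before passing it to the job. The following is only a minimal sketch using java.net.URI from the standard library, not a fix for the exception itself; the class name PathSchemeCheck, the helper schemeOf, and the sample paths are hypothetical. A path with no scheme is resolved against the cluster's default filesystem, which is commonly HDFS.

```java
import java.net.URI;

public class PathSchemeCheck {

    // Returns the URI scheme of a path string, or "(default)" when the path
    // has no scheme and would be resolved against the default filesystem.
    public static String schemeOf(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null ? "(default)" : scheme;
    }

    public static void main(String[] args) {
        // Hypothetical sample paths; pass the job's real arguments instead.
        String[] samples = {
            "s3n://mybucket/input/",
            "hdfs:///user/hadoop/input/",
            "/user/hadoop/input/"
        };
        String[] paths = args.length > 0 ? args : samples;
        for (String p : paths) {
            // One line per path: scheme, a tab, then the path itself.
            System.out.println(schemeOf(p) + "\t" + p);
        }
    }
}
```

Running it with the same arguments the job receives shows whether an input or output path is being interpreted as hdfs rather than as an S3 path.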
