IllegalArgumentException, Wrong FS when specifying input/output from s3 instead of hdfs


Problem description

I have been running my Spark job on a local cluster with HDFS, from which the input is read and to which the output is written. Now I have set up AWS EMR and an S3 bucket where I have my input, and I want my output to be written to S3 as well.

The error:

User class threw exception: java.lang.IllegalArgumentException: Wrong FS: s3://something/input, expected: hdfs://ip-some-numbers.eu-west-1.compute.internal:8020

I tried searching for the same issue and found several existing questions about it. Some suggested that it only applies to the output, but I get the same error even when I disable the output.

Another suggestion is that there is something wrong with FileSystem in my code. Here are all of the occurrences of input/output in my program:

The first occurrence is in my custom FileInputFormat, in getSplits(JobContext job), which I have not actually modified myself but could:

FileSystem fs = path.getFileSystem(job.getConfiguration());

There is a similar case in my custom RecordReader, which I also have not modified myself:

final FileSystem fs = file.getFileSystem(job);

In nextKeyValue() of my custom RecordReader, which I did write myself, I use:

FileSystem fs = FileSystem.get(jc);

And finally, when I want to detect the number of files in a folder, I use:

val fs = FileSystem.get(sc.hadoopConfiguration)
val status = fs.listStatus(new Path(path))

I assume the issue is with my code, but how can I modify the FileSystem calls to support input/output from S3?

Solution

The Hadoop filesystem APIs do not support S3 out of the box. There are two implementations of the Hadoop filesystem APIs for S3: S3A and S3N. S3A seems to be the preferred implementation. To use it you have to do a few things:

  1. Add the aws-java-sdk-bundle.jar to your classpath.
  2. When you create the FileSystem, include values for the following properties in the FileSystem's configuration (a minimal Spark sketch follows this list):

    fs.s3a.access.key
    fs.s3a.secret.key
    

  3. When specifying paths on S3, don't use s3://; use s3a:// instead.
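
A minimal sketch of steps 2 and 3 in a Spark driver. This is a hypothetical example, not from the original answer: the key values and bucket name are placeholders, and for real deployments you would normally prefer IAM instance roles or a credentials provider over hard-coded keys.

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("s3a-example"))

    // Step 2: supply the S3A credentials through the Hadoop configuration.
    sc.hadoopConfiguration.set("fs.s3a.access.key", "YOUR_ACCESS_KEY") // placeholder
    sc.hadoopConfiguration.set("fs.s3a.secret.key", "YOUR_SECRET_KEY") // placeholder

    // Step 3: reference S3 data with the s3a:// scheme, not s3://.
    val lines = sc.textFile("s3a://some-bucket/input")   // hypothetical bucket
    lines.saveAsTextFile("s3a://some-bucket/output")     // hypothetical bucket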

Note: create a simple user and try things out with basic authentication first. It is possible to get this working with AWS's more advanced temporary-credential mechanisms, but that is somewhat involved; I had to make some changes to the FileSystem code to get it to work when I tried.
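
One addition that is not from the linked source but addresses the calls in the question directly: FileSystem.get(conf) always returns the cluster's default filesystem (fs.defaultFS, i.e. hdfs://... on EMR), which is exactly what produces the "Wrong FS" error for an S3 path. Resolving the filesystem from the Path itself, as the getSplits snippet already does, picks the implementation that matches the path's scheme. A sketch using the names from the question's last snippet:

    import org.apache.hadoop.fs.{FileSystem, Path}

    // path may point at hdfs:// or s3a://; the Path resolves the right FS.
    val p = new Path(path)
    val fs: FileSystem = p.getFileSystem(sc.hadoopConfiguration)
    val status = fs.listStatus(p)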

Source of info is here
