Amazon s3a returns 400 Bad Request with Spark


Problem description

For checkpointing purposes I try to set up an Amazon S3 bucket as the checkpoint location.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "s3a://bucket-name/checkpoint.txt"
val sc = new SparkContext(conf)
sc.setLocalProperty("spark.default.parallelism", "30")
sc.hadoopConfiguration.set("fs.s3a.access.key", "xxxxx")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "xxxxx")
sc.hadoopConfiguration.set("fs.s3a.endpoint", "bucket-name.s3-website.eu-central-1.amazonaws.com")
val ssc = new StreamingContext(sc, Seconds(10))
ssc.checkpoint(checkpointDir)

but it stops with this exception:

Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 9D8E8002H3BBDDC7, AWS Error Code: null, AWS Error Message: Bad Request, S3 Extended Request ID: Qme5E3KAr/KX0djiq9poGXPJkmr0vuXAduZujwGlvaAl+oc6vlUpq7LIh70IF3LNgoewjP+HnXA=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:154)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:232)
at com.misterbell.shiva.StreamingApp$.main(StreamingApp.scala:89)
at com.misterbell.shiva.StreamingApp.main(StreamingApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I don't understand why I get this error, and I can't find any examples.

Recommended answer

This message corresponds to something like a "bad endpoint" or an unsupported signature version.

As it turns out, Frankfurt (eu-central-1) is the only region that does not support Signature Version 2, and it's the one I picked.

Of course, after all my research I still can't say exactly what a signature version is; it's not obvious in the documentation. But V2 seems to work with s3a.
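For a SigV4-only region like Frankfurt, a commonly reported workaround (an assumption on my part, not part of the original answer) is to flip the AWS Java SDK v1 switch for Signature Version 4 via a JVM system property, before any S3 client is created. A minimal sketch:

```scala
// Sketch, assuming the AWS Java SDK v1 that this s3a version uses:
// SigV4-only regions such as eu-central-1 (Frankfurt) reject SigV2
// requests with 400 Bad Request. The SDK reads this system property
// when it constructs the S3 client, so it must be set first.
object EnableSigV4 {
  def enable(): Unit =
    System.setProperty("com.amazonaws.services.s3.enableV4", "true")

  def main(args: Array[String]): Unit = {
    enable()
    // Print the property so the effect is visible.
    println(System.getProperty("com.amazonaws.services.s3.enableV4"))
  }
}
```

When submitting a job, the same switch is usually passed as a driver/executor JVM option (for example `-Dcom.amazonaws.services.s3.enableV4` through `spark.driver.extraJavaOptions`), so it takes effect before the filesystem is initialized.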

The endpoint shown in the S3 console is not the real endpoint; it's just the website endpoint.

You have to use one of the real regional endpoints instead, like this:

sc.hadoopConfiguration.set("fs.s3a.endpoint", "s3.eu-west-1.amazonaws.com")

But by default it uses the US endpoint.
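Putting it together, the checkpoint setup from the question could look like the sketch below, with s3a pointed at a regional REST endpoint rather than the website endpoint. The bucket name, keys, and region are placeholders, not values from the original post.

```scala
// Hedged sketch: same setup as in the question, but with the real
// REST endpoint for the bucket's region (here eu-west-1, which still
// accepts Signature Version 2). Requires Spark and hadoop-aws on the
// classpath; "xxxxx" and "bucket-name" are placeholders.
val sc = new SparkContext(conf)
sc.hadoopConfiguration.set("fs.s3a.access.key", "xxxxx")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "xxxxx")
// REST endpoint, not the s3-website.* endpoint shown in the console:
sc.hadoopConfiguration.set("fs.s3a.endpoint", "s3.eu-west-1.amazonaws.com")
val ssc = new StreamingContext(sc, Seconds(10))
ssc.checkpoint("s3a://bucket-name/checkpoint")
```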

