Spark is inventing its own AWS secretKey

Views: 116 Published: 2020/7/16 18:47:32 Tags: amazon-web-services apache-spark amazon-s3 http-status-code-403 access-keys

Problem description

I'm trying to read an S3 bucket from Spark, and up until today Spark has always complained that the request returns 403:

hadoopConf = spark_context._jsc.hadoopConfiguration()
hadoopConf.set("fs.s3a.access.key", "ACCESSKEY")
hadoopConf.set("fs.s3a.secret.key", "SECRETKEY")
hadoopConf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
logs = spark_context.textFile("s3a://mybucket/logs/*")

Spark was saying: ... Invalid Access key [ACCESSKEY]

However, the same ACCESSKEY and SECRETKEY were working with aws-cli:

aws s3 ls mybucket/logs/

and they were also working in Python with boto3:

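import boto3

resource = boto3.resource("s3", region_name="us-east-1")
# Upload a local file; the with-block closes the file handle afterwards
with open("text.py", "rb") as body:
    resource.Object("mybucket", "logs/text.py").put(Body=body, ContentType="text/x-py")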

So my credentials ARE valid, and the problem is definitely something with Spark.

Today I decided to turn on the "DEBUG" log for all of Spark, and to my surprise... Spark is NOT using the [SECRETKEY] I have provided but instead... adds a random one???

17/03/08 10:40:04 DEBUG request: Sending Request: HEAD https://mybucket.s3.amazonaws.com / Headers: (Authorization: AWS ACCESSKEY:[RANDON-SECRET-KEY], User-Agent: aws-sdk-java/1.7.4 Mac_OS_X/10.11.6 Java_HotSpot(TM)_64-Bit_Server_VM/25.65-b01/1.8.0_65, Date: Wed, 08 Mar 2017 10:40:04 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, )

This is why it still returns 403! Spark is not using the key I provided with fs.s3a.secret.key but is instead inventing a random one??

For the record, I'm running this locally on my machine (OSX) with this command:

spark-submit --packages com.amazonaws:aws-java-sdk-pom:1.11.98,org.apache.hadoop:hadoop-aws:2.7.3 test.py

Could someone enlighten me on this?

Solution

I ran into a similar issue. Requests that were using valid AWS credentials returned a 403 Forbidden, but only on certain machines. Eventually I found out that the system time on those particular machines was 10 minutes behind. Synchronizing the system clock solved the problem.
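
One way to check for this kind of drift is to compare the local clock with the Date header that comes back in an HTTP response from the service. The following is only a minimal sketch, not a definitive tool: clock_skew_seconds is a hypothetical helper name, and the header value and local time are made-up sample values chosen to mirror a clock that runs 10 minutes behind.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(server_date_header: str, local_now: datetime) -> float:
    """Return local time minus server time in seconds.

    A negative result means the local clock is behind the server.
    """
    # HTTP Date headers use the RFC 1123 format, e.g. "Wed, 08 Mar 2017 10:40:04 GMT"
    server_time = parsedate_to_datetime(server_date_header)
    return (local_now - server_time).total_seconds()


# Sample values (assumptions for illustration only)
header = "Wed, 08 Mar 2017 10:40:04 GMT"
local = datetime(2017, 3, 8, 10, 30, 4, tzinfo=timezone.utc)

skew = clock_skew_seconds(header, local)
print(f"local clock differs from server by {skew:.0f} s")
```

In real use, local would be datetime.now(timezone.utc) and header would be taken from an actual response. If the difference is large, synchronizing the system clock (for example via NTP) is the fix described above.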

Hope this helps!
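
As background on why the clock can matter: in the older S3 request signing scheme that produces a header of the form "Authorization: AWS ACCESSKEY:...", the value after the access key is a signature computed from the secret key and parts of the request, including the Date header. The secret key itself is never sent, so a random-looking string in that position is expected, and a request whose date is too far from the server's time can be rejected even when the credentials are fine. The sketch below illustrates the idea under stated assumptions: build_string_to_sign and sign_v2 are hypothetical helper names, the canonicalized x-amz-* headers are omitted for brevity, and the resource path "/mybucket/" is an assumed value for this example.

```python
import base64
import hashlib
import hmac


def build_string_to_sign(verb, content_md5, content_type, date, resource):
    # Simplified: a full implementation would also insert the
    # canonicalized x-amz-* headers before the resource.
    return "\n".join([verb, content_md5, content_type, date, resource])


def sign_v2(secret_key: str, string_to_sign: str) -> str:
    """Base64-encoded HMAC-SHA1 of the string to sign, keyed with the secret key."""
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder credentials and request values (assumptions for illustration)
secret = "SECRETKEY"
content_type = "application/x-www-form-urlencoded; charset=utf-8"
to_sign = build_string_to_sign(
    "HEAD", "", content_type, "Wed, 08 Mar 2017 10:40:04 GMT", "/mybucket/"
)

signature = sign_v2(secret, to_sign)
header = f"AWS ACCESSKEY:{signature}"
print(header)  # the part after the colon is a signature, not the secret key
```

Because the date is part of the signed string, the same secret key yields a different signature for every request time, which is consistent with the header value looking random in the DEBUG log.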
