Unable to read from s3 bucket using spark
Question
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("try1")
  .master("local")
  .getOrCreate()

import spark.implicits._ // needed for the $"uid" column syntax

val df = spark.read
  .json("s3n://BUCKET-NAME/FOLDER/FILE.json")
  .select($"uid")

df.show(5)
I have set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as environment variables. I get the error below when trying to read from S3.
Exception in thread "main" org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/FOLDER%2FFILE.json' - ResponseCode=400, ResponseMessage=Bad Request
I suspect the error is caused by "/" being converted to "%2F" by some internal function, since the error shows '/FOLDER%2FFILE.json' instead of '/FOLDER/FILE.json'.
Answer
Your Spark (JVM) application cannot read environment variables unless you tell it to, so a quick workaround is to set the keys on the Hadoop configuration:
spark.sparkContext
  .hadoopConfiguration.set("fs.s3n.awsAccessKeyId", awsAccessKeyId)
spark.sparkContext
  .hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", awsSecretAccessKey)
You'll also need to specify the S3 endpoint:
spark.sparkContext
  .hadoopConfiguration.set("fs.s3a.endpoint", "<<ENDPOINT>>")
To learn more about what an AWS S3 endpoint is, refer to the AWS documentation on Regions and Endpoints.