How does Java find Spark, Hadoop and AWS jars in IntelliJ

Problem description

I am running a Spark application in Java on IntelliJ. I have added the Spark, Hadoop and AWS dependencies in pom.xml, but somehow the AWS credentials are not being loaded.

The exact error I get is: Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint

Below are my .java and pom.xml files.

SparkSession spark = SparkSession
        .builder()
        .master("local")
        .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
        .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
        .config("spark.hadoop.fs.s3a.awsAccessKeyId", AWS_KEY)
        .config("spark.hadoop.fs.s3a.awsSecretAccessKey", AWS_SECRET_KEY)
        .getOrCreate();

JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());
Dataset<Row> dF = spark.read().load("s3a://bucket/abc.parquet");

Here is my pom.xml:

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.3.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.3.2</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk</artifactId>
        <version>1.11.417</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-aws</artifactId>
        <version>3.1.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>3.1.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>3.1.1</version>
    </dependency>
</dependencies>

I have been stuck on this for a while and have tried all the available solutions. I also exported the AWS keys as environment variables.
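Since the provider chain in the error includes EnvironmentVariableCredentialsProvider, one thing worth confirming is that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are actually visible to the JVM that IntelliJ launches, because shell exports are not always inherited by an IDE run configuration. A minimal, illustrative check (not part of the original code):

// Illustrative check: are the credential environment variables visible to this JVM?
// These are the two variables EnvironmentVariableCredentialsProvider reads.
String accessKey = System.getenv("AWS_ACCESS_KEY_ID");
String secretKey = System.getenv("AWS_SECRET_ACCESS_KEY");
System.out.println("AWS_ACCESS_KEY_ID set:     " + (accessKey != null));
System.out.println("AWS_SECRET_ACCESS_KEY set: " + (secretKey != null));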

Is there any other way to specify the jars or keys for Java, considering that there is no Java spark shell like there is for Python or Scala, and pom.xml is the only way?

Solution

Found that you have to add the AWS credentials to the SparkContext (via its Hadoop configuration), and not to the SparkSession.

JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());
sc.hadoopConfiguration().set("fs.s3a.access.key", AWS_KEY);
sc.hadoopConfiguration().set("fs.s3a.secret.key", AWS_SECRET_KEY);
