AWS EKS Spark 3.0, Hadoop 3.2 Error - NoClassDefFoundError: com/amazonaws/services/s3/model/MultiObjectDeleteException


Question

I'm running Jupyterhub on EKS and want to leverage the EKS IRSA functionality to run Spark workloads on K8s. I have prior experience with Kube2IAM; however, I'm now planning to move to IRSA.

This error is not caused by IRSA: the service accounts attach to the driver and executor pods perfectly fine, and I can access S3 from both via the CLI and the SDK. The issue is specific to accessing S3 through Spark on Spark 3.0/Hadoop 3.2:

Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. : java.lang.NoClassDefFoundError: com/amazonaws/services/s3/model/MultiObjectDeleteException

I'm using the following versions:

  • APACHE_SPARK_VERSION = 3.0.1
  • HADOOP_VERSION = 3.2
  • aws-java-sdk-1.11.890
  • hadoop-aws-3.2.0
  • Python 3.7.3

I have also tested with different versions:

  • aws-java-sdk-1.11.563.jar
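
Swapping in standalone aws-java-sdk jars is unlikely to help here: hadoop-aws expects the matching aws-java-sdk-bundle jar it was compiled against, and for hadoop-aws 3.2.0 that is aws-java-sdk-bundle 1.11.375 (per its Maven POM). A minimal lookup sketch of that pairing, listing only the version I can verify; other releases should be checked against their own `pom.xml`:

```shell
# Map a hadoop-aws release to the aws-java-sdk-bundle it was compiled against.
# Only the 3.2.0 pairing below is taken from the hadoop-aws 3.2.0 pom.xml;
# extend this by checking the POM of your exact hadoop-aws version.
required_sdk_bundle() {
  case "$1" in
    3.2.0) echo "1.11.375" ;;
    *) echo "unknown - check the hadoop-aws pom.xml for aws-java-sdk-bundle" >&2
       return 1 ;;
  esac
}

required_sdk_bundle 3.2.0   # prints 1.11.375
```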

Please help with a solution if someone has come across this issue.

PS: This is not an IAM policy error either, because the IAM policies are perfectly fine.

Answer

You can check out this blog (https://medium.com/swlh/how-to-perform-a-spark-submit-to-amazon-eks-cluster-with-irsa-50af9b26cae), which uses:

  • Spark 2.4.4
  • Hadoop 2.7.3
  • AWS SDK 1.11.834

The example spark-submit is:

/opt/spark/bin/spark-submit \
    --master=k8s://https://4A5<i_am_tu>545E6.sk1.ap-southeast-1.eks.amazonaws.com \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.kubernetes.driver.pod.name=spark-pi-driver \
    --conf spark.kubernetes.container.image=vitamingaugau/spark:spark-2.4.4-irsa \
    --conf spark.kubernetes.namespace=spark-pi \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-pi \
    --conf spark.kubernetes.authenticate.executor.serviceAccountName=spark-pi \
    --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider \
    --conf spark.kubernetes.authenticate.submission.caCertFile=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
    --conf spark.kubernetes.authenticate.submission.oauthTokenFile=/var/run/secrets/kubernetes.io/serviceaccount/token \
    local:///opt/spark/examples/target/scala-2.11/jars/spark-examples_2.11-2.4.4.jar 20000
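
The command above targets Spark 2.4.4. For the Spark 3.0.1/Hadoop 3.2 setup in the question, an adapted submit might look like the following sketch. This is an assumption-laden adaptation, not part of the original answer: the endpoint and image names are placeholders, and the key change is pulling hadoop-aws 3.2.0 together with the aws-java-sdk-bundle it was compiled against (1.11.375, per its Maven POM) instead of a standalone aws-java-sdk jar:

```shell
# Hypothetical adaptation for Spark 3.0.1 / Hadoop 3.2.
# <your-eks-endpoint> and <your-spark-3.0.1-image> are placeholders;
# verify the jar versions against the hadoop-aws pom.xml of your build.
/opt/spark/bin/spark-submit \
    --master=k8s://https://<your-eks-endpoint> \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --packages org.apache.hadoop:hadoop-aws:3.2.0,com.amazonaws:aws-java-sdk-bundle:1.11.375 \
    --conf spark.kubernetes.container.image=<your-spark-3.0.1-image> \
    --conf spark.kubernetes.namespace=spark-pi \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-pi \
    --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider \
    local:///opt/spark/examples/jars/spark-examples_2.12-3.0.1.jar 20000
```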

