当在Kubernetes上运行Spark以访问以kerberized方式存储的Hadoop集群时，如何解决“未启用SIMPLE身份验证"的问题?执行者出错? [英] When running Spark on Kubernetes to access kerberized Hadoop cluster, how do you resolve a "SIMPLE authentication is not enabled" error on executors?

查看：304 发布时间：2020/4/25 11:09:35 docker apache-spark kubernetes openshift kerberos

本文介绍了当在Kubernetes上运行Spark以访问以kerberized方式存储的Hadoop集群时，如何解决“未启用SIMPLE身份验证"的问题?执行者出错?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我试图在Kubernetes上运行Spark，以处理来自Kerberized Hadoop集群的数据.我的应用程序由简单的SparkSQL转换组成.虽然我可以在单个驱动程序窗格上成功运行该过程，但是在尝试使用任何执行程序时却无法执行此操作.相反，我得到:

I'm trying to run Spark on Kubernetes, with the aim of processing data from a Kerberized Hadoop cluster. My application consists of simple SparkSQL transformations. While I'm able to run the process successfully on a single driver pod, I cannot do this when attempting to use any executors. Instead, I get:

org.apache.hadoop.security.AccessControlException:简单未启用身份验证.可用:[TOKEN，KERBEROS]

org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]

由于Hadoop环境是Kerberized，因此我提供了有效的密钥表，以及core-site.xml，hive-site.xml，hadoop-site.xml，mapred-site.xml和yarn-site. xml，以及docker映像中的krb5.conf文件.

Since the Hadoop environment is Kerberized, I've provided a valid keytab, as well as the core-site.xml, hive-site.xml, hadoop-site.xml, mapred-site.xml and yarn-site.xml, and a krb5.conf file inside the docker image.

我通过以下方法设置环境设置:

I set up the environment settings with the following method:

trait EnvironmentConfiguration {

def configureEnvironment(): Unit = {
  val conf = new Configuration
  conf.set("hadoop.security.authentication", "kerberos")
  conf.set("hadoop.security.authorization", "true")
  conf.set("com.sun.security.auth.module.Krb5LoginModule", "required")
  System.setProperty("java.security.krb5.conf", ConfigurationProperties.kerberosConfLocation)    
  UserGroupInformation.loginUserFromKeytab(ConfigurationProperties.keytabUser, ConfigurationProperties.keytabLocation)
  UserGroupInformation.setConfiguration(conf)
}

我还通过以下方法传递* -site.xml文件:

I also pass the *-site.xml files through the following method:

trait SparkConfiguration {

  def createSparkSession(): SparkSession = {
    val spark = SparkSession.builder
    .appName("MiniSparkK8")
    .enableHiveSupport()
    .master("local[*]")
    .config("spark.sql.hive.metastore.version", ConfigurationProperties.hiveMetastoreVersion)
    .config("spark.executor.memory", ConfigurationProperties.sparkExecutorMemory)
    .config("spark.sql.hive.version", ConfigurationProperties.hiveVersion)
    .config("spark.sql.hive.metastore.jars",ConfigurationProperties.hiveMetastoreJars)
    spark.sparkContext.hadoopConfiguration.addResource(new Path(ConfigurationProperties.coreSiteLocation))
    spark.sparkContext.hadoopConfiguration.addResource(new Path(ConfigurationProperties.hiveSiteLocation))
    spark.sparkContext.hadoopConfiguration.addResource(new Path(ConfigurationProperties.hdfsSiteLocation))
    spark.sparkContext.hadoopConfiguration.addResource(new Path(ConfigurationProperties.yarnSiteLocation))
    spark.sparkContext.hadoopConfiguration.addResource(new Path(ConfigurationProperties.mapredSiteLocation))
  }
}

我使用以下spark-submit命令运行整个过程:

I run the whole process with the following spark-submit command:

spark-submit ^
--master k8s://https://kubernetes.example.environment.url:8443 ^
--deploy-mode cluster ^
--name mini-spark-k8 ^
--class org.spark.Driver ^
--conf spark.executor.instances=2 ^
--conf spark.kubernetes.namespace=<company-openshift-namespace> ^
--conf spark.kubernetes.container.image=<company_image_registry.image> ^
--conf spark.kubernetes.driver.pod.name=minisparkk8-cluster ^
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark ^
local:///opt/spark/examples/target/MiniSparkK8-1.0-SNAPSHOT.jar ^
/opt/spark/mini-spark-conf.properties

以上配置足以使我的spark应用程序运行并成功连接到Kerberized Hadoop集群.尽管spark Submit命令声明了两个执行程序pod的创建，但这不会发生，因为我已将master设置为local[*].因此，仅创建了一个Pod，该Pod可以连接到Kerberized Hadoop集群并成功在Hive表上运行我的Spark转换.

The above configurations are enough to get my spark application running and successfully connecting to the Kerberized Hadoop cluster. Although the spark submit command declares the creation of two executor pods, this does not happen because I have set master to local[*]. Consequently, only one pod is created which manages to connect to the Kerberized Hadoop cluster and successfully run my Spark transformations on Hive tables.

但是，当我删除.master(local[*])时，将创建两个执行程序容器.从日志中可以看到，这些执行程序已成功连接到驱动程序窗格，并且已分配了任务.在这之后不久，他们两个都因上述错误而失败，从而导致失败的执行程序容器被终止. 尽管执行者已经拥有所有必要的文件，才能在其映像中创建到Kerberized Hadoop的成功连接.我相信执行者没有使用keytab，如果他们正在运行JAR，他们会这样做.相反，它们正在运行从驱动程序分配给他们的任务.

However, when I remove .master(local[*]), two executor pods are created. I can see from the logs that these executors connecting successfully to the driver pod, and they are assigned tasks. It doesn't take long after this point for both of them to fail with the error mentioned above, resulting in the failed executor pods to be terminated. This is despite the executors already having all the necessary files to create a successful connection to the Kerberized Hadoop inside their image. I believe that the executors are not using the keytab, which they would be doing if they were running the JAR. Instead, they're running tasks given to them from the driver.

我从日志中看到，驱动程序设法使用用户USER123的密钥标签正确验证了自己的身份:

I can see from the logs that the driver manages to authenticate itself correctly with the keytab for user, USER123:

INFO SecurityManager:54-SecurityManager:身份验证已禁用； ui acls已禁用；具有查看权限的用户:Set(spark，USER123); 具有查看权限的组:Set();具有修改权限的用户: 设置(火花，USER123);具有修改权限的组:Set()

INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark, USER123); groups with view permissions: Set(); users with modify permissions: Set(spark, USER123); groups with modify permissions: Set()

另一方面，您从执行者的日志中获得以下信息，您可以看到该用户USER123未通过身份验证:

On the other hand, you get the following from the executor's log, you can see that user, USER123 is not authenticated:

INFO SecurityManager:54-SecurityManager:身份验证已禁用； ui acls已禁用；具有查看权限的用户:Set(spark);团体具有查看权限:Set();具有修改权限的用户: 套装(火花);具有修改权限的组:Set()

INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark); groups with view permissions: Set(); users with modify permissions: Set(spark); groups with modify permissions: Set()

我查看了各种来源，包括此处 .它提到需要定义HIVE_CONF_DIR，但是我从我的程序(打印环境变量)中可以看到该变量不存在，包括驱动程序pod设法对其自身进行身份验证并正常运行spark过程时.

I have looked at various sources, including here. It mentions that HIVE_CONF_DIR needs to be defined, but I can see from my program (which prints the environment variables) that this variable is not present, including when the driver pod manages to authenticate itself and run the spark process fine.

我尝试在先前的spark-submit命令中添加以下内容来运行:

I've tried running with the following added to the previous spark-submit command:

--conf spark.kubernetes.kerberos.enabled=true ^
--conf spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf ^
--conf spark.kubernetes.kerberos.keytab=/var/keytabs/USER123.keytab ^
--conf spark.kubernetes.kerberos.principal=USER123@REALM ^

但这没什么区别.

我的问题是:如何让执行者使用其图像中的密钥表对自己进行身份验证?我希望这将使他们能够执行委派的任务.

My question is: how can I get the executors to authenticate themselves with the keytab they have in their image? I'm hoping this will allow them to perform their delegated tasks.

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭