How do I get the YARN ContainerId from inside the container?


Question

I'm running a Spark job on YARN and would like to get the YARN container ID (as part of a requirement to generate unique IDs across a set of Spark jobs). I can see the Container.getId() method, which returns the ContainerId, but I have no idea how to get a reference to the currently running container from YARN. Is this even possible? How does a YARN container get its own information?

Recommended answer

The only way I found to get the container ID was through the logging directory, whose last path component is the container ID string. The following works in a Spark shell.

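import org.apache.hadoop.yarn.api.records.ContainerId

// Derives the container ID from the executor's YARN log directory.
def f(): String = {
  // System property expected on Spark executors running on YARN.
  val localLogDir: String = System.getProperty("spark.yarn.app.container.log.dir")
  // The last path component is the container ID in string form.
  val containerIdString: String = localLogDir.split("/").last
  val containerIdLong: Long = ContainerId.fromString(containerIdString).getContainerId
  containerIdLong.toHexString
}

// Call f() inside a task so that it runs in the executors' containers, not on the driver.
val rdd1 = sc.parallelize(1 to 10).map { _ => f() }
rdd1.distinct.collect().foreach(println)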
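
The snippet above throws a NullPointerException if the system property is not set, for example when the code runs outside YARN. A more defensive variant might look like the sketch below. It is only an illustration under stated assumptions: the object and method names are hypothetical, and the fallback order is a design choice rather than anything the answer prescribes. It also assumes that the YARN NodeManager exports the container ID in the CONTAINER_ID environment variable, which should be verified on the target cluster.

```scala
import java.io.File

// Hypothetical helper object; the names are illustrative and not part of Spark or Hadoop.
object ContainerIdSketch {

  // Returns the last path component of the log directory if it looks like a
  // YARN container ID string (these start with "container_"), otherwise None.
  def containerIdFromLogDir(logDir: String): Option[String] =
    Option(logDir)                       // None if the property was not set (null)
      .map(path => new File(path).getName)
      .filter(_.startsWith("container_"))

  // Assumption: YARN sets CONTAINER_ID in the container's environment.
  // If it is absent, fall back to the log directory used in the answer.
  def currentContainerId(): Option[String] =
    sys.env.get("CONTAINER_ID")
      .orElse(containerIdFromLogDir(
        System.getProperty("spark.yarn.app.container.log.dir")))
}
```

As in the answer, currentContainerId() would need to be called inside a task (for example within a map over an RDD) so that it executes in an executor's container. Returning an Option leaves it to the caller to decide what to do when no container ID is available.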
