How do I get the YARN ContainerId from inside the container?
Question
I'm running a Spark job on YARN and would like to get the YARN container ID (as part of a requirement to generate unique IDs across a set of Spark jobs). I can see the Container.getId() method, which returns the ContainerId, but I have no idea how to get a reference to the currently running container from YARN. Is this even possible? How does a YARN container get its own information?
Recommended answer
The only approach I could get to work was to read the container's logging directory, whose last path component is the container ID string. The following works in a Spark shell.
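import org.apache.hadoop.yarn.api.records.ContainerId

// Runs on an executor: derives the container ID from the container's log directory.
def f(): String = {
  // On YARN this property holds the container's log directory;
  // its last path component is the container ID string.
  val localLogDir: String = System.getProperty("spark.yarn.app.container.log.dir")
  val containerIdString: String = localLogDir.split("/").last
  // Parse the string and take the numeric container ID.
  val containerIdLong: Long = ContainerId.fromString(containerIdString).getContainerId
  containerIdLong.toHexString
}

// Call f() inside a task so that it runs in the executors' containers.
val rdd1 = sc.parallelize(1 to 10).map { _ => f() }
rdd1.distinct.collect().foreach(println)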
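System.getProperty returns null when the property is not set (for example, when the code runs outside a YARN container), in which case the snippet above would fail with a NullPointerException. A minimal sketch of a more defensive variant of the path-splitting step follows. The helper name containerIdFromLogDir is hypothetical, and it assumes the container ID string always starts with "container_"; it has no Hadoop dependency and only extracts the string, leaving the parsing to ContainerId.fromString.

```scala
// Hypothetical helper (not part of Spark or Hadoop): extracts the container ID
// string from a log directory path, or returns None if the path is missing
// or does not end in something that looks like a container ID.
def containerIdFromLogDir(logDir: String): Option[String] =
  Option(logDir)                               // null (property not set) becomes None
    .map(_.split("/").filter(_.nonEmpty))      // split the path into its non-empty components
    .flatMap(_.lastOption)                     // take the last component, if there is one
    .filter(_.startsWith("container_"))        // assumption: container IDs start with "container_"
```

Inside a task it could be called as containerIdFromLogDir(System.getProperty("spark.yarn.app.container.log.dir")), and the resulting Option handled explicitly instead of letting the task fail.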