Spark doesn't print outputs on the console within the map function
Problem description
I have a simple Spark application running in cluster mode.
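import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.streaming.{Seconds, StreamingContext}
val funcGSSNFilterHeader = (x: String) => {
  println(!x.contains("servedMSISDN"))
  !x.contains("servedMSISDN")
}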
val ssc = new StreamingContext(sc, Seconds(batchIntervalSeconds))
val ggsnFileLines = ssc.fileStream[LongWritable, Text, TextInputFormat]("C:\\Users\\Mbazarganigilani\\Documents\\RA\\GGSN\\Files1", filterF, false)
val ggsnArrays = ggsnFileLines
.map(x => x._2.toString()).filter(x => funcGSSNFilterHeader(x))
ggsnArrays.foreachRDD(s => {println(x.toString()})
I need to print !x.contains("servedMSISDN") inside the map function for debugging purposes, but nothing is printed on the console.
Recommended answer
Your code runs partly on the driver (the main/master process) and partly on the executors (which run on the worker nodes in cluster mode).
Functions that run inside a "map" run on the executors.
That is, in cluster mode a print inside the map function writes to the console (stdout) of the node where the executor runs, which you won't see on the driver console.
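The split between driver-side and executor-side code can be sketched as follows. This is a minimal illustration and not the asker's actual job: the object name DriverVsExecutorSketch, the helper isDataLine, the in-memory queue input, and the local[2] master are all assumptions made so the example is self-contained. With a local master both prints land in the same console; in cluster mode only the "driver side" lines would appear there.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

object DriverVsExecutorSketch {
  // Same predicate as in the question: keep lines that are not the header.
  def isDataLine(line: String): Boolean = !line.contains("servedMSISDN")

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, used here only so the sketch runs on one machine.
    val conf = new SparkConf().setAppName("driver-vs-executor-sketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Hypothetical in-memory input standing in for the file stream in the question.
    val queue = mutable.Queue[RDD[String]](
      ssc.sparkContext.parallelize(Seq("servedMSISDN,header", "12345,data")))
    val lines = ssc.queueStream(queue)

    val filtered = lines.filter { line =>
      // This closure is shipped to the executors. In cluster mode its output
      // goes to each executor's stdout, not to the driver console.
      println(s"executor side: $line")
      isDataLine(line)
    }

    filtered.foreachRDD { rdd =>
      // The body of foreachRDD runs on the driver. take() pulls a small sample
      // of records back, so this print appears on the driver console.
      rdd.take(10).foreach(line => println(s"driver side: $line"))
    }

    ssc.start()
    ssc.awaitTerminationOrTimeout(5000) // let a few batches run, then shut down
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

Pulling only a bounded sample with take() keeps the driver from being flooded when the stream is large.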
In order to debug the program, you can:
- Run the code in "local" mode. Because the executors then run on the same machine as the driver, the prints in the "map function" appear on the console of your "master/main node".
- Replace "print to console" with save to file / save to Elastic / etc.
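Both options can be combined in one small batch job, sketched below. This is only an illustration under stated assumptions: the object name LocalDebugSketch, the helper isDataLine, the sample input lines, and the temporary output directory are hypothetical and not part of the original application.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

object LocalDebugSketch {
  // Same predicate as in the question: keep lines that are not the header.
  def isDataLine(line: String): Boolean = !line.contains("servedMSISDN")

  def main(args: Array[String]): Unit = {
    // Option 1: a local master, so executor-side prints show up in this console.
    val conf = new SparkConf().setAppName("local-debug-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample input standing in for the GGSN files.
      val lines = sc.parallelize(Seq("servedMSISDN,header", "12345,data"))

      val decisions = lines.map { line =>
        val keep = isDataLine(line)
        println(s"$line -> keep=$keep") // visible here only because the master is local
        s"$line -> keep=$keep"
      }

      // Option 2: write the debug information to files instead of the console.
      // saveAsTextFile needs a path that does not exist yet, hence the child
      // directory inside a fresh temporary directory.
      val out = Files.createTempDirectory("ggsn-debug").resolve("decisions").toString
      decisions.saveAsTextFile(out)
      println(s"Debug output written to $out")
    } finally {
      sc.stop()
    }
  }
}
```

On a real cluster the output path would normally point at shared storage such as HDFS, so that the files written by the different executors end up in one place.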
Note that, in addition to the local vs cluster mode issue, there seems to be a typo in your code:
ggsnArrays.foreachRDD(s => {println(x.toString()})
It should be (with the closing parenthesis and brace in the right order, and using the lambda parameter s, since x is not defined in this scope):
ggsnArrays.foreachRDD(s => {println(s.toString)})